martinhillebrandtd committed
Commit 0616721 · 1 Parent(s): 68feece
README.md CHANGED
@@ -1,3 +1,2653 @@
- ---
- license: mit
- ---
1
+ ---
2
+ language:
3
+ - en
4
+ license: mit
5
+ model-index:
6
+ - name: bge-base-en-v1.5
7
+ results:
8
+ - dataset:
9
+ config: en
10
+ name: MTEB AmazonCounterfactualClassification (en)
11
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
12
+ split: test
13
+ type: mteb/amazon_counterfactual
14
+ metrics:
15
+ - type: accuracy
16
+ value: 76.14925373134328
17
+ - type: ap
18
+ value: 39.32336517995478
19
+ - type: f1
20
+ value: 70.16902252611425
21
+ task:
22
+ type: Classification
23
+ - dataset:
24
+ config: default
25
+ name: MTEB AmazonPolarityClassification
26
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
27
+ split: test
28
+ type: mteb/amazon_polarity
29
+ metrics:
30
+ - type: accuracy
31
+ value: 93.386825
32
+ - type: ap
33
+ value: 90.21276917991995
34
+ - type: f1
35
+ value: 93.37741030006174
36
+ task:
37
+ type: Classification
38
+ - dataset:
39
+ config: en
40
+ name: MTEB AmazonReviewsClassification (en)
41
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
42
+ split: test
43
+ type: mteb/amazon_reviews_multi
44
+ metrics:
45
+ - type: accuracy
46
+ value: 48.846000000000004
47
+ - type: f1
48
+ value: 48.14646269778261
49
+ task:
50
+ type: Classification
51
+ - dataset:
52
+ config: default
53
+ name: MTEB ArguAna
54
+ revision: None
55
+ split: test
56
+ type: arguana
57
+ metrics:
58
+ - type: map_at_1
59
+ value: 40.754000000000005
60
+ - type: map_at_10
61
+ value: 55.761
62
+ - type: map_at_100
63
+ value: 56.330999999999996
64
+ - type: map_at_1000
65
+ value: 56.333999999999996
66
+ - type: map_at_3
67
+ value: 51.92
68
+ - type: map_at_5
69
+ value: 54.010999999999996
70
+ - type: mrr_at_1
71
+ value: 41.181
72
+ - type: mrr_at_10
73
+ value: 55.967999999999996
74
+ - type: mrr_at_100
75
+ value: 56.538
76
+ - type: mrr_at_1000
77
+ value: 56.542
78
+ - type: mrr_at_3
79
+ value: 51.980000000000004
80
+ - type: mrr_at_5
81
+ value: 54.208999999999996
82
+ - type: ndcg_at_1
83
+ value: 40.754000000000005
84
+ - type: ndcg_at_10
85
+ value: 63.605000000000004
86
+ - type: ndcg_at_100
87
+ value: 66.05199999999999
88
+ - type: ndcg_at_1000
89
+ value: 66.12
90
+ - type: ndcg_at_3
91
+ value: 55.708
92
+ - type: ndcg_at_5
93
+ value: 59.452000000000005
94
+ - type: precision_at_1
95
+ value: 40.754000000000005
96
+ - type: precision_at_10
97
+ value: 8.841000000000001
98
+ - type: precision_at_100
99
+ value: 0.991
100
+ - type: precision_at_1000
101
+ value: 0.1
102
+ - type: precision_at_3
103
+ value: 22.238
104
+ - type: precision_at_5
105
+ value: 15.149000000000001
106
+ - type: recall_at_1
107
+ value: 40.754000000000005
108
+ - type: recall_at_10
109
+ value: 88.407
110
+ - type: recall_at_100
111
+ value: 99.14699999999999
112
+ - type: recall_at_1000
113
+ value: 99.644
114
+ - type: recall_at_3
115
+ value: 66.714
116
+ - type: recall_at_5
117
+ value: 75.747
118
+ task:
119
+ type: Retrieval
120
+ - dataset:
121
+ config: default
122
+ name: MTEB ArxivClusteringP2P
123
+ revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
124
+ split: test
125
+ type: mteb/arxiv-clustering-p2p
126
+ metrics:
127
+ - type: v_measure
128
+ value: 48.74884539679369
129
+ task:
130
+ type: Clustering
131
+ - dataset:
132
+ config: default
133
+ name: MTEB ArxivClusteringS2S
134
+ revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
135
+ split: test
136
+ type: mteb/arxiv-clustering-s2s
137
+ metrics:
138
+ - type: v_measure
139
+ value: 42.8075893810716
140
+ task:
141
+ type: Clustering
142
+ - dataset:
143
+ config: default
144
+ name: MTEB AskUbuntuDupQuestions
145
+ revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
146
+ split: test
147
+ type: mteb/askubuntudupquestions-reranking
148
+ metrics:
149
+ - type: map
150
+ value: 62.128470519187736
151
+ - type: mrr
152
+ value: 74.28065778481289
153
+ task:
154
+ type: Reranking
155
+ - dataset:
156
+ config: default
157
+ name: MTEB BIOSSES
158
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
159
+ split: test
160
+ type: mteb/biosses-sts
161
+ metrics:
162
+ - type: cos_sim_pearson
163
+ value: 89.24629081484655
164
+ - type: cos_sim_spearman
165
+ value: 86.93752309911496
166
+ - type: euclidean_pearson
167
+ value: 87.58589628573816
168
+ - type: euclidean_spearman
169
+ value: 88.05622328825284
170
+ - type: manhattan_pearson
171
+ value: 87.5594959805773
172
+ - type: manhattan_spearman
173
+ value: 88.19658793233961
174
+ task:
175
+ type: STS
176
+ - dataset:
177
+ config: default
178
+ name: MTEB Banking77Classification
179
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
180
+ split: test
181
+ type: mteb/banking77
182
+ metrics:
183
+ - type: accuracy
184
+ value: 86.9512987012987
185
+ - type: f1
186
+ value: 86.92515357973708
187
+ task:
188
+ type: Classification
189
+ - dataset:
190
+ config: default
191
+ name: MTEB BiorxivClusteringP2P
192
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
193
+ split: test
194
+ type: mteb/biorxiv-clustering-p2p
195
+ metrics:
196
+ - type: v_measure
197
+ value: 39.10263762928872
198
+ task:
199
+ type: Clustering
200
+ - dataset:
201
+ config: default
202
+ name: MTEB BiorxivClusteringS2S
203
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
204
+ split: test
205
+ type: mteb/biorxiv-clustering-s2s
206
+ metrics:
207
+ - type: v_measure
208
+ value: 36.69711517426737
209
+ task:
210
+ type: Clustering
211
+ - dataset:
212
+ config: default
213
+ name: MTEB CQADupstackAndroidRetrieval
214
+ revision: None
215
+ split: test
216
+ type: BeIR/cqadupstack
217
+ metrics:
218
+ - type: map_at_1
219
+ value: 32.327
220
+ - type: map_at_10
221
+ value: 44.099
222
+ - type: map_at_100
223
+ value: 45.525
224
+ - type: map_at_1000
225
+ value: 45.641999999999996
226
+ - type: map_at_3
227
+ value: 40.47
228
+ - type: map_at_5
229
+ value: 42.36
230
+ - type: mrr_at_1
231
+ value: 39.199
232
+ - type: mrr_at_10
233
+ value: 49.651
234
+ - type: mrr_at_100
235
+ value: 50.29
236
+ - type: mrr_at_1000
237
+ value: 50.329
238
+ - type: mrr_at_3
239
+ value: 46.924
240
+ - type: mrr_at_5
241
+ value: 48.548
242
+ - type: ndcg_at_1
243
+ value: 39.199
244
+ - type: ndcg_at_10
245
+ value: 50.773
246
+ - type: ndcg_at_100
247
+ value: 55.67999999999999
248
+ - type: ndcg_at_1000
249
+ value: 57.495
250
+ - type: ndcg_at_3
251
+ value: 45.513999999999996
252
+ - type: ndcg_at_5
253
+ value: 47.703
254
+ - type: precision_at_1
255
+ value: 39.199
256
+ - type: precision_at_10
257
+ value: 9.914000000000001
258
+ - type: precision_at_100
259
+ value: 1.5310000000000001
260
+ - type: precision_at_1000
261
+ value: 0.198
262
+ - type: precision_at_3
263
+ value: 21.984
264
+ - type: precision_at_5
265
+ value: 15.737000000000002
266
+ - type: recall_at_1
267
+ value: 32.327
268
+ - type: recall_at_10
269
+ value: 63.743
270
+ - type: recall_at_100
271
+ value: 84.538
272
+ - type: recall_at_1000
273
+ value: 96.089
274
+ - type: recall_at_3
275
+ value: 48.065000000000005
276
+ - type: recall_at_5
277
+ value: 54.519
278
+ - type: map_at_1
279
+ value: 32.671
280
+ - type: map_at_10
281
+ value: 42.954
282
+ - type: map_at_100
283
+ value: 44.151
284
+ - type: map_at_1000
285
+ value: 44.287
286
+ - type: map_at_3
287
+ value: 39.912
288
+ - type: map_at_5
289
+ value: 41.798
290
+ - type: mrr_at_1
291
+ value: 41.465
292
+ - type: mrr_at_10
293
+ value: 49.351
294
+ - type: mrr_at_100
295
+ value: 49.980000000000004
296
+ - type: mrr_at_1000
297
+ value: 50.016000000000005
298
+ - type: mrr_at_3
299
+ value: 47.144000000000005
300
+ - type: mrr_at_5
301
+ value: 48.592999999999996
302
+ - type: ndcg_at_1
303
+ value: 41.465
304
+ - type: ndcg_at_10
305
+ value: 48.565999999999995
306
+ - type: ndcg_at_100
307
+ value: 52.76499999999999
308
+ - type: ndcg_at_1000
309
+ value: 54.749
310
+ - type: ndcg_at_3
311
+ value: 44.57
312
+ - type: ndcg_at_5
313
+ value: 46.759
314
+ - type: precision_at_1
315
+ value: 41.465
316
+ - type: precision_at_10
317
+ value: 9.107999999999999
318
+ - type: precision_at_100
319
+ value: 1.433
320
+ - type: precision_at_1000
321
+ value: 0.191
322
+ - type: precision_at_3
323
+ value: 21.423000000000002
324
+ - type: precision_at_5
325
+ value: 15.414
326
+ - type: recall_at_1
327
+ value: 32.671
328
+ - type: recall_at_10
329
+ value: 57.738
330
+ - type: recall_at_100
331
+ value: 75.86500000000001
332
+ - type: recall_at_1000
333
+ value: 88.36
334
+ - type: recall_at_3
335
+ value: 45.626
336
+ - type: recall_at_5
337
+ value: 51.812000000000005
338
+ - type: map_at_1
339
+ value: 41.185
340
+ - type: map_at_10
341
+ value: 53.929
342
+ - type: map_at_100
343
+ value: 54.92
344
+ - type: map_at_1000
345
+ value: 54.967999999999996
346
+ - type: map_at_3
347
+ value: 50.70400000000001
348
+ - type: map_at_5
349
+ value: 52.673
350
+ - type: mrr_at_1
351
+ value: 47.398
352
+ - type: mrr_at_10
353
+ value: 57.303000000000004
354
+ - type: mrr_at_100
355
+ value: 57.959
356
+ - type: mrr_at_1000
357
+ value: 57.985
358
+ - type: mrr_at_3
359
+ value: 54.932
360
+ - type: mrr_at_5
361
+ value: 56.464999999999996
362
+ - type: ndcg_at_1
363
+ value: 47.398
364
+ - type: ndcg_at_10
365
+ value: 59.653
366
+ - type: ndcg_at_100
367
+ value: 63.627
368
+ - type: ndcg_at_1000
369
+ value: 64.596
370
+ - type: ndcg_at_3
371
+ value: 54.455
372
+ - type: ndcg_at_5
373
+ value: 57.245000000000005
374
+ - type: precision_at_1
375
+ value: 47.398
376
+ - type: precision_at_10
377
+ value: 9.524000000000001
378
+ - type: precision_at_100
379
+ value: 1.243
380
+ - type: precision_at_1000
381
+ value: 0.13699999999999998
382
+ - type: precision_at_3
383
+ value: 24.389
384
+ - type: precision_at_5
385
+ value: 16.752
386
+ - type: recall_at_1
387
+ value: 41.185
388
+ - type: recall_at_10
389
+ value: 73.193
390
+ - type: recall_at_100
391
+ value: 90.357
392
+ - type: recall_at_1000
393
+ value: 97.253
394
+ - type: recall_at_3
395
+ value: 59.199999999999996
396
+ - type: recall_at_5
397
+ value: 66.118
398
+ - type: map_at_1
399
+ value: 27.27
400
+ - type: map_at_10
401
+ value: 36.223
402
+ - type: map_at_100
403
+ value: 37.218
404
+ - type: map_at_1000
405
+ value: 37.293
406
+ - type: map_at_3
407
+ value: 33.503
408
+ - type: map_at_5
409
+ value: 35.097
410
+ - type: mrr_at_1
411
+ value: 29.492
412
+ - type: mrr_at_10
413
+ value: 38.352000000000004
414
+ - type: mrr_at_100
415
+ value: 39.188
416
+ - type: mrr_at_1000
417
+ value: 39.247
418
+ - type: mrr_at_3
419
+ value: 35.876000000000005
420
+ - type: mrr_at_5
421
+ value: 37.401
422
+ - type: ndcg_at_1
423
+ value: 29.492
424
+ - type: ndcg_at_10
425
+ value: 41.239
426
+ - type: ndcg_at_100
427
+ value: 46.066
428
+ - type: ndcg_at_1000
429
+ value: 47.992000000000004
430
+ - type: ndcg_at_3
431
+ value: 36.11
432
+ - type: ndcg_at_5
433
+ value: 38.772
434
+ - type: precision_at_1
435
+ value: 29.492
436
+ - type: precision_at_10
437
+ value: 6.260000000000001
438
+ - type: precision_at_100
439
+ value: 0.914
440
+ - type: precision_at_1000
441
+ value: 0.11100000000000002
442
+ - type: precision_at_3
443
+ value: 15.104000000000001
444
+ - type: precision_at_5
445
+ value: 10.644
446
+ - type: recall_at_1
447
+ value: 27.27
448
+ - type: recall_at_10
449
+ value: 54.589
450
+ - type: recall_at_100
451
+ value: 76.70700000000001
452
+ - type: recall_at_1000
453
+ value: 91.158
454
+ - type: recall_at_3
455
+ value: 40.974
456
+ - type: recall_at_5
457
+ value: 47.327000000000005
458
+ - type: map_at_1
459
+ value: 17.848
460
+ - type: map_at_10
461
+ value: 26.207
462
+ - type: map_at_100
463
+ value: 27.478
464
+ - type: map_at_1000
465
+ value: 27.602
466
+ - type: map_at_3
467
+ value: 23.405
468
+ - type: map_at_5
469
+ value: 24.98
470
+ - type: mrr_at_1
471
+ value: 21.891
472
+ - type: mrr_at_10
473
+ value: 31.041999999999998
474
+ - type: mrr_at_100
475
+ value: 32.092
476
+ - type: mrr_at_1000
477
+ value: 32.151999999999994
478
+ - type: mrr_at_3
479
+ value: 28.358
480
+ - type: mrr_at_5
481
+ value: 29.969
482
+ - type: ndcg_at_1
483
+ value: 21.891
484
+ - type: ndcg_at_10
485
+ value: 31.585
486
+ - type: ndcg_at_100
487
+ value: 37.531
488
+ - type: ndcg_at_1000
489
+ value: 40.256
490
+ - type: ndcg_at_3
491
+ value: 26.508
492
+ - type: ndcg_at_5
493
+ value: 28.894
494
+ - type: precision_at_1
495
+ value: 21.891
496
+ - type: precision_at_10
497
+ value: 5.795999999999999
498
+ - type: precision_at_100
499
+ value: 0.9990000000000001
500
+ - type: precision_at_1000
501
+ value: 0.13799999999999998
502
+ - type: precision_at_3
503
+ value: 12.769
504
+ - type: precision_at_5
505
+ value: 9.279
506
+ - type: recall_at_1
507
+ value: 17.848
508
+ - type: recall_at_10
509
+ value: 43.452
510
+ - type: recall_at_100
511
+ value: 69.216
512
+ - type: recall_at_1000
513
+ value: 88.102
514
+ - type: recall_at_3
515
+ value: 29.18
516
+ - type: recall_at_5
517
+ value: 35.347
518
+ - type: map_at_1
519
+ value: 30.94
520
+ - type: map_at_10
521
+ value: 41.248000000000005
522
+ - type: map_at_100
523
+ value: 42.495
524
+ - type: map_at_1000
525
+ value: 42.602000000000004
526
+ - type: map_at_3
527
+ value: 37.939
528
+ - type: map_at_5
529
+ value: 39.924
530
+ - type: mrr_at_1
531
+ value: 37.824999999999996
532
+ - type: mrr_at_10
533
+ value: 47.041
534
+ - type: mrr_at_100
535
+ value: 47.83
536
+ - type: mrr_at_1000
537
+ value: 47.878
538
+ - type: mrr_at_3
539
+ value: 44.466
540
+ - type: mrr_at_5
541
+ value: 46.111999999999995
542
+ - type: ndcg_at_1
543
+ value: 37.824999999999996
544
+ - type: ndcg_at_10
545
+ value: 47.223
546
+ - type: ndcg_at_100
547
+ value: 52.394
548
+ - type: ndcg_at_1000
549
+ value: 54.432
550
+ - type: ndcg_at_3
551
+ value: 42.032000000000004
552
+ - type: ndcg_at_5
553
+ value: 44.772
554
+ - type: precision_at_1
555
+ value: 37.824999999999996
556
+ - type: precision_at_10
557
+ value: 8.393
558
+ - type: precision_at_100
559
+ value: 1.2890000000000001
560
+ - type: precision_at_1000
561
+ value: 0.164
562
+ - type: precision_at_3
563
+ value: 19.698
564
+ - type: precision_at_5
565
+ value: 14.013
566
+ - type: recall_at_1
567
+ value: 30.94
568
+ - type: recall_at_10
569
+ value: 59.316
570
+ - type: recall_at_100
571
+ value: 80.783
572
+ - type: recall_at_1000
573
+ value: 94.15400000000001
574
+ - type: recall_at_3
575
+ value: 44.712
576
+ - type: recall_at_5
577
+ value: 51.932
578
+ - type: map_at_1
579
+ value: 27.104
580
+ - type: map_at_10
581
+ value: 36.675999999999995
582
+ - type: map_at_100
583
+ value: 38.076
584
+ - type: map_at_1000
585
+ value: 38.189
586
+ - type: map_at_3
587
+ value: 33.733999999999995
588
+ - type: map_at_5
589
+ value: 35.287
590
+ - type: mrr_at_1
591
+ value: 33.904
592
+ - type: mrr_at_10
593
+ value: 42.55
594
+ - type: mrr_at_100
595
+ value: 43.434
596
+ - type: mrr_at_1000
597
+ value: 43.494
598
+ - type: mrr_at_3
599
+ value: 40.126
600
+ - type: mrr_at_5
601
+ value: 41.473
602
+ - type: ndcg_at_1
603
+ value: 33.904
604
+ - type: ndcg_at_10
605
+ value: 42.414
606
+ - type: ndcg_at_100
607
+ value: 48.203
608
+ - type: ndcg_at_1000
609
+ value: 50.437
610
+ - type: ndcg_at_3
611
+ value: 37.633
612
+ - type: ndcg_at_5
613
+ value: 39.67
614
+ - type: precision_at_1
615
+ value: 33.904
616
+ - type: precision_at_10
617
+ value: 7.82
618
+ - type: precision_at_100
619
+ value: 1.2409999999999999
620
+ - type: precision_at_1000
621
+ value: 0.159
622
+ - type: precision_at_3
623
+ value: 17.884
624
+ - type: precision_at_5
625
+ value: 12.648000000000001
626
+ - type: recall_at_1
627
+ value: 27.104
628
+ - type: recall_at_10
629
+ value: 53.563
630
+ - type: recall_at_100
631
+ value: 78.557
632
+ - type: recall_at_1000
633
+ value: 93.533
634
+ - type: recall_at_3
635
+ value: 39.92
636
+ - type: recall_at_5
637
+ value: 45.457
638
+ - type: map_at_1
639
+ value: 27.707749999999997
640
+ - type: map_at_10
641
+ value: 36.961
642
+ - type: map_at_100
643
+ value: 38.158833333333334
644
+ - type: map_at_1000
645
+ value: 38.270333333333326
646
+ - type: map_at_3
647
+ value: 34.07183333333334
648
+ - type: map_at_5
649
+ value: 35.69533333333334
650
+ - type: mrr_at_1
651
+ value: 32.81875
652
+ - type: mrr_at_10
653
+ value: 41.293
654
+ - type: mrr_at_100
655
+ value: 42.116499999999995
656
+ - type: mrr_at_1000
657
+ value: 42.170249999999996
658
+ - type: mrr_at_3
659
+ value: 38.83983333333333
660
+ - type: mrr_at_5
661
+ value: 40.29775
662
+ - type: ndcg_at_1
663
+ value: 32.81875
664
+ - type: ndcg_at_10
665
+ value: 42.355
666
+ - type: ndcg_at_100
667
+ value: 47.41374999999999
668
+ - type: ndcg_at_1000
669
+ value: 49.5805
670
+ - type: ndcg_at_3
671
+ value: 37.52825
672
+ - type: ndcg_at_5
673
+ value: 39.83266666666667
674
+ - type: precision_at_1
675
+ value: 32.81875
676
+ - type: precision_at_10
677
+ value: 7.382416666666666
678
+ - type: precision_at_100
679
+ value: 1.1640833333333334
680
+ - type: precision_at_1000
681
+ value: 0.15383333333333335
682
+ - type: precision_at_3
683
+ value: 17.134166666666665
684
+ - type: precision_at_5
685
+ value: 12.174833333333336
686
+ - type: recall_at_1
687
+ value: 27.707749999999997
688
+ - type: recall_at_10
689
+ value: 53.945
690
+ - type: recall_at_100
691
+ value: 76.191
692
+ - type: recall_at_1000
693
+ value: 91.101
694
+ - type: recall_at_3
695
+ value: 40.39083333333334
696
+ - type: recall_at_5
697
+ value: 46.40083333333333
698
+ - type: map_at_1
699
+ value: 26.482
700
+ - type: map_at_10
701
+ value: 33.201
702
+ - type: map_at_100
703
+ value: 34.107
704
+ - type: map_at_1000
705
+ value: 34.197
706
+ - type: map_at_3
707
+ value: 31.174000000000003
708
+ - type: map_at_5
709
+ value: 32.279
710
+ - type: mrr_at_1
711
+ value: 29.908
712
+ - type: mrr_at_10
713
+ value: 36.235
714
+ - type: mrr_at_100
715
+ value: 37.04
716
+ - type: mrr_at_1000
717
+ value: 37.105
718
+ - type: mrr_at_3
719
+ value: 34.355999999999995
720
+ - type: mrr_at_5
721
+ value: 35.382999999999996
722
+ - type: ndcg_at_1
723
+ value: 29.908
724
+ - type: ndcg_at_10
725
+ value: 37.325
726
+ - type: ndcg_at_100
727
+ value: 41.795
728
+ - type: ndcg_at_1000
729
+ value: 44.105
730
+ - type: ndcg_at_3
731
+ value: 33.555
732
+ - type: ndcg_at_5
733
+ value: 35.266999999999996
734
+ - type: precision_at_1
735
+ value: 29.908
736
+ - type: precision_at_10
737
+ value: 5.721
738
+ - type: precision_at_100
739
+ value: 0.8630000000000001
740
+ - type: precision_at_1000
741
+ value: 0.11299999999999999
742
+ - type: precision_at_3
743
+ value: 14.008000000000001
744
+ - type: precision_at_5
745
+ value: 9.754999999999999
746
+ - type: recall_at_1
747
+ value: 26.482
748
+ - type: recall_at_10
749
+ value: 47.072
750
+ - type: recall_at_100
751
+ value: 67.27
752
+ - type: recall_at_1000
753
+ value: 84.371
754
+ - type: recall_at_3
755
+ value: 36.65
756
+ - type: recall_at_5
757
+ value: 40.774
758
+ - type: map_at_1
759
+ value: 18.815
760
+ - type: map_at_10
761
+ value: 26.369999999999997
762
+ - type: map_at_100
763
+ value: 27.458
764
+ - type: map_at_1000
765
+ value: 27.588
766
+ - type: map_at_3
767
+ value: 23.990000000000002
768
+ - type: map_at_5
769
+ value: 25.345000000000002
770
+ - type: mrr_at_1
771
+ value: 22.953000000000003
772
+ - type: mrr_at_10
773
+ value: 30.342999999999996
774
+ - type: mrr_at_100
775
+ value: 31.241000000000003
776
+ - type: mrr_at_1000
777
+ value: 31.319000000000003
778
+ - type: mrr_at_3
779
+ value: 28.16
780
+ - type: mrr_at_5
781
+ value: 29.406
782
+ - type: ndcg_at_1
783
+ value: 22.953000000000003
784
+ - type: ndcg_at_10
785
+ value: 31.151
786
+ - type: ndcg_at_100
787
+ value: 36.309000000000005
788
+ - type: ndcg_at_1000
789
+ value: 39.227000000000004
790
+ - type: ndcg_at_3
791
+ value: 26.921
792
+ - type: ndcg_at_5
793
+ value: 28.938000000000002
794
+ - type: precision_at_1
795
+ value: 22.953000000000003
796
+ - type: precision_at_10
797
+ value: 5.602
798
+ - type: precision_at_100
799
+ value: 0.9530000000000001
800
+ - type: precision_at_1000
801
+ value: 0.13899999999999998
802
+ - type: precision_at_3
803
+ value: 12.606
804
+ - type: precision_at_5
805
+ value: 9.119
806
+ - type: recall_at_1
807
+ value: 18.815
808
+ - type: recall_at_10
809
+ value: 41.574
810
+ - type: recall_at_100
811
+ value: 64.84400000000001
812
+ - type: recall_at_1000
813
+ value: 85.406
814
+ - type: recall_at_3
815
+ value: 29.694
816
+ - type: recall_at_5
817
+ value: 34.935
818
+ - type: map_at_1
819
+ value: 27.840999999999998
820
+ - type: map_at_10
821
+ value: 36.797999999999995
822
+ - type: map_at_100
823
+ value: 37.993
824
+ - type: map_at_1000
825
+ value: 38.086999999999996
826
+ - type: map_at_3
827
+ value: 34.050999999999995
828
+ - type: map_at_5
829
+ value: 35.379
830
+ - type: mrr_at_1
831
+ value: 32.649
832
+ - type: mrr_at_10
833
+ value: 41.025
834
+ - type: mrr_at_100
835
+ value: 41.878
836
+ - type: mrr_at_1000
837
+ value: 41.929
838
+ - type: mrr_at_3
839
+ value: 38.573
840
+ - type: mrr_at_5
841
+ value: 39.715
842
+ - type: ndcg_at_1
843
+ value: 32.649
844
+ - type: ndcg_at_10
845
+ value: 42.142
846
+ - type: ndcg_at_100
847
+ value: 47.558
848
+ - type: ndcg_at_1000
849
+ value: 49.643
850
+ - type: ndcg_at_3
851
+ value: 37.12
852
+ - type: ndcg_at_5
853
+ value: 38.983000000000004
854
+ - type: precision_at_1
855
+ value: 32.649
856
+ - type: precision_at_10
857
+ value: 7.08
858
+ - type: precision_at_100
859
+ value: 1.1039999999999999
860
+ - type: precision_at_1000
861
+ value: 0.13899999999999998
862
+ - type: precision_at_3
863
+ value: 16.698
864
+ - type: precision_at_5
865
+ value: 11.511000000000001
866
+ - type: recall_at_1
867
+ value: 27.840999999999998
868
+ - type: recall_at_10
869
+ value: 54.245
870
+ - type: recall_at_100
871
+ value: 77.947
872
+ - type: recall_at_1000
873
+ value: 92.36999999999999
874
+ - type: recall_at_3
875
+ value: 40.146
876
+ - type: recall_at_5
877
+ value: 44.951
878
+ - type: map_at_1
879
+ value: 26.529000000000003
880
+ - type: map_at_10
881
+ value: 35.010000000000005
882
+ - type: map_at_100
883
+ value: 36.647
884
+ - type: map_at_1000
885
+ value: 36.857
886
+ - type: map_at_3
887
+ value: 31.968000000000004
888
+ - type: map_at_5
889
+ value: 33.554
890
+ - type: mrr_at_1
891
+ value: 31.818
892
+ - type: mrr_at_10
893
+ value: 39.550999999999995
894
+ - type: mrr_at_100
895
+ value: 40.54
896
+ - type: mrr_at_1000
897
+ value: 40.596
898
+ - type: mrr_at_3
899
+ value: 36.726
900
+ - type: mrr_at_5
901
+ value: 38.416
902
+ - type: ndcg_at_1
903
+ value: 31.818
904
+ - type: ndcg_at_10
905
+ value: 40.675
906
+ - type: ndcg_at_100
907
+ value: 46.548
908
+ - type: ndcg_at_1000
909
+ value: 49.126
910
+ - type: ndcg_at_3
911
+ value: 35.829
912
+ - type: ndcg_at_5
913
+ value: 38.0
914
+ - type: precision_at_1
915
+ value: 31.818
916
+ - type: precision_at_10
917
+ value: 7.826
918
+ - type: precision_at_100
919
+ value: 1.538
920
+ - type: precision_at_1000
921
+ value: 0.24
922
+ - type: precision_at_3
923
+ value: 16.601
924
+ - type: precision_at_5
925
+ value: 12.095
926
+ - type: recall_at_1
927
+ value: 26.529000000000003
928
+ - type: recall_at_10
929
+ value: 51.03
930
+ - type: recall_at_100
931
+ value: 77.556
932
+ - type: recall_at_1000
933
+ value: 93.804
934
+ - type: recall_at_3
935
+ value: 36.986000000000004
936
+ - type: recall_at_5
937
+ value: 43.096000000000004
938
+ - type: map_at_1
939
+ value: 23.480999999999998
940
+ - type: map_at_10
941
+ value: 30.817
942
+ - type: map_at_100
943
+ value: 31.838
944
+ - type: map_at_1000
945
+ value: 31.932
946
+ - type: map_at_3
947
+ value: 28.011999999999997
948
+ - type: map_at_5
949
+ value: 29.668
950
+ - type: mrr_at_1
951
+ value: 25.323
952
+ - type: mrr_at_10
953
+ value: 33.072
954
+ - type: mrr_at_100
955
+ value: 33.926
956
+ - type: mrr_at_1000
957
+ value: 33.993
958
+ - type: mrr_at_3
959
+ value: 30.436999999999998
960
+ - type: mrr_at_5
961
+ value: 32.092
962
+ - type: ndcg_at_1
963
+ value: 25.323
964
+ - type: ndcg_at_10
965
+ value: 35.514
966
+ - type: ndcg_at_100
967
+ value: 40.489000000000004
968
+ - type: ndcg_at_1000
969
+ value: 42.908
970
+ - type: ndcg_at_3
971
+ value: 30.092000000000002
972
+ - type: ndcg_at_5
973
+ value: 32.989000000000004
974
+ - type: precision_at_1
975
+ value: 25.323
976
+ - type: precision_at_10
977
+ value: 5.545
978
+ - type: precision_at_100
979
+ value: 0.861
980
+ - type: precision_at_1000
981
+ value: 0.117
982
+ - type: precision_at_3
983
+ value: 12.446
984
+ - type: precision_at_5
985
+ value: 9.131
986
+ - type: recall_at_1
987
+ value: 23.480999999999998
988
+ - type: recall_at_10
989
+ value: 47.825
990
+ - type: recall_at_100
991
+ value: 70.652
992
+ - type: recall_at_1000
993
+ value: 88.612
994
+ - type: recall_at_3
995
+ value: 33.537
996
+ - type: recall_at_5
997
+ value: 40.542
998
+ task:
999
+ type: Retrieval
1000
+ - dataset:
1001
+ config: default
1002
+ name: MTEB ClimateFEVER
1003
+ revision: None
1004
+ split: test
1005
+ type: climate-fever
1006
+ metrics:
1007
+ - type: map_at_1
1008
+ value: 13.333999999999998
1009
+ - type: map_at_10
1010
+ value: 22.524
1011
+ - type: map_at_100
1012
+ value: 24.506
1013
+ - type: map_at_1000
1014
+ value: 24.715
1015
+ - type: map_at_3
1016
+ value: 19.022
1017
+ - type: map_at_5
1018
+ value: 20.693
1019
+ - type: mrr_at_1
1020
+ value: 29.186
1021
+ - type: mrr_at_10
1022
+ value: 41.22
1023
+ - type: mrr_at_100
1024
+ value: 42.16
1025
+ - type: mrr_at_1000
1026
+ value: 42.192
1027
+ - type: mrr_at_3
1028
+ value: 38.013000000000005
1029
+ - type: mrr_at_5
1030
+ value: 39.704
1031
+ - type: ndcg_at_1
1032
+ value: 29.186
1033
+ - type: ndcg_at_10
1034
+ value: 31.167
1035
+ - type: ndcg_at_100
1036
+ value: 38.879000000000005
1037
+ - type: ndcg_at_1000
1038
+ value: 42.376000000000005
1039
+ - type: ndcg_at_3
1040
+ value: 25.817
1041
+ - type: ndcg_at_5
1042
+ value: 27.377000000000002
1043
+ - type: precision_at_1
1044
+ value: 29.186
1045
+ - type: precision_at_10
1046
+ value: 9.693999999999999
1047
+ - type: precision_at_100
1048
+ value: 1.8030000000000002
1049
+ - type: precision_at_1000
1050
+ value: 0.246
1051
+ - type: precision_at_3
1052
+ value: 19.11
1053
+ - type: precision_at_5
1054
+ value: 14.344999999999999
1055
+ - type: recall_at_1
1056
+ value: 13.333999999999998
1057
+ - type: recall_at_10
1058
+ value: 37.092000000000006
1059
+ - type: recall_at_100
1060
+ value: 63.651
1061
+ - type: recall_at_1000
1062
+ value: 83.05
1063
+ - type: recall_at_3
1064
+ value: 23.74
1065
+ - type: recall_at_5
1066
+ value: 28.655
1067
+ task:
1068
+ type: Retrieval
1069
+ - dataset:
1070
+ config: default
1071
+ name: MTEB DBPedia
1072
+ revision: None
1073
+ split: test
1074
+ type: dbpedia-entity
1075
+ metrics:
1076
+ - type: map_at_1
1077
+ value: 9.151
1078
+ - type: map_at_10
1079
+ value: 19.653000000000002
1080
+ - type: map_at_100
1081
+ value: 28.053
1082
+ - type: map_at_1000
1083
+ value: 29.709000000000003
1084
+ - type: map_at_3
1085
+ value: 14.191
1086
+ - type: map_at_5
1087
+ value: 16.456
1088
+ - type: mrr_at_1
1089
+ value: 66.25
1090
+ - type: mrr_at_10
1091
+ value: 74.4
1092
+ - type: mrr_at_100
1093
+ value: 74.715
1094
+ - type: mrr_at_1000
1095
+ value: 74.726
1096
+ - type: mrr_at_3
1097
+ value: 72.417
1098
+ - type: mrr_at_5
1099
+ value: 73.667
1100
+ - type: ndcg_at_1
1101
+ value: 54.25
1102
+ - type: ndcg_at_10
1103
+ value: 40.77
1104
+ - type: ndcg_at_100
1105
+ value: 46.359
1106
+ - type: ndcg_at_1000
1107
+ value: 54.193000000000005
1108
+ - type: ndcg_at_3
1109
+ value: 44.832
1110
+ - type: ndcg_at_5
1111
+ value: 42.63
1112
+ - type: precision_at_1
1113
+ value: 66.25
1114
+ - type: precision_at_10
1115
+ value: 32.175
1116
+ - type: precision_at_100
1117
+ value: 10.668
1118
+ - type: precision_at_1000
1119
+ value: 2.067
1120
+ - type: precision_at_3
1121
+ value: 47.667
1122
+ - type: precision_at_5
1123
+ value: 41.3
1124
+ - type: recall_at_1
1125
+ value: 9.151
1126
+ - type: recall_at_10
1127
+ value: 25.003999999999998
1128
+ - type: recall_at_100
1129
+ value: 52.976
1130
+ - type: recall_at_1000
1131
+ value: 78.315
1132
+ - type: recall_at_3
1133
+ value: 15.487
1134
+ - type: recall_at_5
1135
+ value: 18.999
1136
+ task:
1137
+ type: Retrieval
1138
+ - dataset:
1139
+ config: default
1140
+ name: MTEB EmotionClassification
1141
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1142
+ split: test
1143
+ type: mteb/emotion
1144
+ metrics:
1145
+ - type: accuracy
1146
+ value: 51.89999999999999
1147
+ - type: f1
1148
+ value: 46.47777925067403
1149
+ task:
1150
+ type: Classification
1151
+ - dataset:
1152
+ config: default
1153
+ name: MTEB FEVER
1154
+ revision: None
1155
+ split: test
1156
+ type: fever
1157
+ metrics:
1158
+ - type: map_at_1
1159
+ value: 73.706
1160
+ - type: map_at_10
1161
+ value: 82.423
1162
+ - type: map_at_100
1163
+ value: 82.67999999999999
1164
+ - type: map_at_1000
1165
+ value: 82.694
1166
+ - type: map_at_3
1167
+ value: 81.328
1168
+ - type: map_at_5
1169
+ value: 82.001
1170
+ - type: mrr_at_1
1171
+ value: 79.613
1172
+ - type: mrr_at_10
1173
+ value: 87.07000000000001
1174
+ - type: mrr_at_100
1175
+ value: 87.169
1176
+ - type: mrr_at_1000
1177
+ value: 87.17
1178
+ - type: mrr_at_3
1179
+ value: 86.404
1180
+ - type: mrr_at_5
1181
+ value: 86.856
1182
+ - type: ndcg_at_1
1183
+ value: 79.613
1184
+ - type: ndcg_at_10
1185
+ value: 86.289
1186
+ - type: ndcg_at_100
1187
+ value: 87.201
1188
+ - type: ndcg_at_1000
1189
+ value: 87.428
1190
+ - type: ndcg_at_3
1191
+ value: 84.625
1192
+ - type: ndcg_at_5
1193
+ value: 85.53699999999999
1194
+ - type: precision_at_1
1195
+ value: 79.613
1196
+ - type: precision_at_10
1197
+ value: 10.399
1198
+ - type: precision_at_100
1199
+ value: 1.1079999999999999
1200
+ - type: precision_at_1000
1201
+ value: 0.11499999999999999
1202
+ - type: precision_at_3
1203
+ value: 32.473
1204
+ - type: precision_at_5
1205
+ value: 20.132
1206
+ - type: recall_at_1
1207
+ value: 73.706
1208
+ - type: recall_at_10
1209
+ value: 93.559
1210
+ - type: recall_at_100
1211
+ value: 97.188
1212
+ - type: recall_at_1000
1213
+ value: 98.555
1214
+ - type: recall_at_3
1215
+ value: 88.98700000000001
1216
+ - type: recall_at_5
1217
+ value: 91.373
1218
+ task:
1219
+ type: Retrieval
1220
+ - dataset:
1221
+ config: default
1222
+ name: MTEB FiQA2018
1223
+ revision: None
1224
+ split: test
1225
+ type: fiqa
1226
+ metrics:
1227
+ - type: map_at_1
1228
+ value: 19.841
1229
+ - type: map_at_10
1230
+ value: 32.643
1231
+ - type: map_at_100
1232
+ value: 34.575
1233
+ - type: map_at_1000
1234
+ value: 34.736
1235
+ - type: map_at_3
1236
+ value: 28.317999999999998
1237
+ - type: map_at_5
1238
+ value: 30.964000000000002
1239
+ - type: mrr_at_1
1240
+ value: 39.660000000000004
1241
+ - type: mrr_at_10
1242
+ value: 48.620000000000005
1243
+ - type: mrr_at_100
1244
+ value: 49.384
1245
+ - type: mrr_at_1000
1246
+ value: 49.415
1247
+ - type: mrr_at_3
1248
+ value: 45.988
1249
+ - type: mrr_at_5
1250
+ value: 47.361
1251
+ - type: ndcg_at_1
1252
+ value: 39.660000000000004
1253
+ - type: ndcg_at_10
1254
+ value: 40.646
1255
+ - type: ndcg_at_100
1256
+ value: 47.657
1257
+ - type: ndcg_at_1000
1258
+ value: 50.428
1259
+ - type: ndcg_at_3
1260
+ value: 36.689
1261
+ - type: ndcg_at_5
1262
+ value: 38.211
1263
+ - type: precision_at_1
1264
+ value: 39.660000000000004
1265
+ - type: precision_at_10
1266
+ value: 11.235000000000001
1267
+ - type: precision_at_100
1268
+ value: 1.8530000000000002
1269
+ - type: precision_at_1000
1270
+ value: 0.23600000000000002
1271
+ - type: precision_at_3
1272
+ value: 24.587999999999997
1273
+ - type: precision_at_5
1274
+ value: 18.395
1275
+ - type: recall_at_1
1276
+ value: 19.841
1277
+ - type: recall_at_10
1278
+ value: 48.135
1279
+ - type: recall_at_100
1280
+ value: 74.224
1281
+ - type: recall_at_1000
1282
+ value: 90.826
1283
+ - type: recall_at_3
1284
+ value: 33.536
1285
+ - type: recall_at_5
1286
+ value: 40.311
1287
+ task:
1288
+ type: Retrieval
1289
+ - dataset:
1290
+ config: default
1291
+ name: MTEB HotpotQA
1292
+ revision: None
1293
+ split: test
1294
+ type: hotpotqa
1295
+ metrics:
1296
+ - type: map_at_1
1297
+ value: 40.358
1298
+ - type: map_at_10
1299
+ value: 64.497
1300
+ - type: map_at_100
1301
+ value: 65.362
1302
+ - type: map_at_1000
1303
+ value: 65.41900000000001
1304
+ - type: map_at_3
1305
+ value: 61.06700000000001
1306
+ - type: map_at_5
1307
+ value: 63.317
1308
+ - type: mrr_at_1
1309
+ value: 80.716
1310
+ - type: mrr_at_10
1311
+ value: 86.10799999999999
1312
+ - type: mrr_at_100
1313
+ value: 86.265
1314
+ - type: mrr_at_1000
1315
+ value: 86.27
1316
+ - type: mrr_at_3
1317
+ value: 85.271
1318
+ - type: mrr_at_5
1319
+ value: 85.82499999999999
1320
+ - type: ndcg_at_1
1321
+ value: 80.716
1322
+ - type: ndcg_at_10
1323
+ value: 72.597
1324
+ - type: ndcg_at_100
1325
+ value: 75.549
1326
+ - type: ndcg_at_1000
1327
+ value: 76.61
1328
+ - type: ndcg_at_3
1329
+ value: 67.874
1330
+ - type: ndcg_at_5
1331
+ value: 70.655
1332
+ - type: precision_at_1
1333
+ value: 80.716
1334
+ - type: precision_at_10
1335
+ value: 15.148
1336
+ - type: precision_at_100
1337
+ value: 1.745
1338
+ - type: precision_at_1000
1339
+ value: 0.188
1340
+ - type: precision_at_3
1341
+ value: 43.597
1342
+ - type: precision_at_5
1343
+ value: 28.351
1344
+ - type: recall_at_1
1345
+ value: 40.358
1346
+ - type: recall_at_10
1347
+ value: 75.739
1348
+ - type: recall_at_100
1349
+ value: 87.259
1350
+ - type: recall_at_1000
1351
+ value: 94.234
1352
+ - type: recall_at_3
1353
+ value: 65.39500000000001
1354
+ - type: recall_at_5
1355
+ value: 70.878
1356
+ task:
1357
+ type: Retrieval
1358
+ - dataset:
1359
+ config: default
1360
+ name: MTEB ImdbClassification
1361
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1362
+ split: test
1363
+ type: mteb/imdb
1364
+ metrics:
1365
+ - type: accuracy
1366
+ value: 90.80799999999998
1367
+ - type: ap
1368
+ value: 86.81350378180757
1369
+ - type: f1
1370
+ value: 90.79901248314215
1371
+ task:
1372
+ type: Classification
1373
+ - dataset:
1374
+ config: default
1375
+ name: MTEB MSMARCO
1376
+ revision: None
1377
+ split: dev
1378
+ type: msmarco
1379
+ metrics:
1380
+ - type: map_at_1
1381
+ value: 22.096
1382
+ - type: map_at_10
1383
+ value: 34.384
1384
+ - type: map_at_100
1385
+ value: 35.541
1386
+ - type: map_at_1000
1387
+ value: 35.589999999999996
1388
+ - type: map_at_3
1389
+ value: 30.496000000000002
1390
+ - type: map_at_5
1391
+ value: 32.718
1392
+ - type: mrr_at_1
1393
+ value: 22.750999999999998
1394
+ - type: mrr_at_10
1395
+ value: 35.024
1396
+ - type: mrr_at_100
1397
+ value: 36.125
1398
+ - type: mrr_at_1000
1399
+ value: 36.168
1400
+ - type: mrr_at_3
1401
+ value: 31.225
1402
+ - type: mrr_at_5
1403
+ value: 33.416000000000004
1404
+ - type: ndcg_at_1
1405
+ value: 22.750999999999998
1406
+ - type: ndcg_at_10
1407
+ value: 41.351
1408
+ - type: ndcg_at_100
1409
+ value: 46.92
1410
+ - type: ndcg_at_1000
1411
+ value: 48.111
1412
+ - type: ndcg_at_3
1413
+ value: 33.439
1414
+ - type: ndcg_at_5
1415
+ value: 37.407000000000004
1416
+ - type: precision_at_1
1417
+ value: 22.750999999999998
1418
+ - type: precision_at_10
1419
+ value: 6.564
1420
+ - type: precision_at_100
1421
+ value: 0.935
1422
+ - type: precision_at_1000
1423
+ value: 0.104
1424
+ - type: precision_at_3
1425
+ value: 14.288
1426
+ - type: precision_at_5
1427
+ value: 10.581999999999999
1428
+ - type: recall_at_1
1429
+ value: 22.096
1430
+ - type: recall_at_10
1431
+ value: 62.771
1432
+ - type: recall_at_100
1433
+ value: 88.529
1434
+ - type: recall_at_1000
1435
+ value: 97.55
1436
+ - type: recall_at_3
1437
+ value: 41.245
1438
+ - type: recall_at_5
1439
+ value: 50.788
1440
+ task:
1441
+ type: Retrieval
1442
+ - dataset:
1443
+ config: en
1444
+ name: MTEB MTOPDomainClassification (en)
1445
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1446
+ split: test
1447
+ type: mteb/mtop_domain
1448
+ metrics:
1449
+ - type: accuracy
1450
+ value: 94.16780665754673
1451
+ - type: f1
1452
+ value: 93.96331194859894
1453
+ task:
1454
+ type: Classification
1455
+ - dataset:
1456
+ config: en
1457
+ name: MTEB MTOPIntentClassification (en)
1458
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1459
+ split: test
1460
+ type: mteb/mtop_intent
1461
+ metrics:
1462
+ - type: accuracy
1463
+ value: 76.90606475148198
1464
+ - type: f1
1465
+ value: 58.58344986604187
1466
+ task:
1467
+ type: Classification
1468
+ - dataset:
1469
+ config: en
1470
+ name: MTEB MassiveIntentClassification (en)
1471
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1472
+ split: test
1473
+ type: mteb/amazon_massive_intent
1474
+ metrics:
1475
+ - type: accuracy
1476
+ value: 76.14660390047075
1477
+ - type: f1
1478
+ value: 74.31533923533614
1479
+ task:
1480
+ type: Classification
1481
+ - dataset:
1482
+ config: en
1483
+ name: MTEB MassiveScenarioClassification (en)
1484
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1485
+ split: test
1486
+ type: mteb/amazon_massive_scenario
1487
+ metrics:
1488
+ - type: accuracy
1489
+ value: 80.16139878950908
1490
+ - type: f1
1491
+ value: 80.18532656824924
1492
+ task:
1493
+ type: Classification
1494
+ - dataset:
1495
+ config: default
1496
+ name: MTEB MedrxivClusteringP2P
1497
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1498
+ split: test
1499
+ type: mteb/medrxiv-clustering-p2p
1500
+ metrics:
1501
+ - type: v_measure
1502
+ value: 32.949880906135085
1503
+ task:
1504
+ type: Clustering
1505
+ - dataset:
1506
+ config: default
1507
+ name: MTEB MedrxivClusteringS2S
1508
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1509
+ split: test
1510
+ type: mteb/medrxiv-clustering-s2s
1511
+ metrics:
1512
+ - type: v_measure
1513
+ value: 31.56300351524862
1514
+ task:
1515
+ type: Clustering
1516
+ - dataset:
1517
+ config: default
1518
+ name: MTEB MindSmallReranking
1519
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1520
+ split: test
1521
+ type: mteb/mind_small
1522
+ metrics:
1523
+ - type: map
1524
+ value: 31.196521894371315
1525
+ - type: mrr
1526
+ value: 32.22644231694389
1527
+ task:
1528
+ type: Reranking
1529
+ - dataset:
1530
+ config: default
1531
+ name: MTEB NFCorpus
1532
+ revision: None
1533
+ split: test
1534
+ type: nfcorpus
1535
+ metrics:
1536
+ - type: map_at_1
1537
+ value: 6.783
1538
+ - type: map_at_10
1539
+ value: 14.549000000000001
1540
+ - type: map_at_100
1541
+ value: 18.433
1542
+ - type: map_at_1000
1543
+ value: 19.949
1544
+ - type: map_at_3
1545
+ value: 10.936
1546
+ - type: map_at_5
1547
+ value: 12.514
1548
+ - type: mrr_at_1
1549
+ value: 47.368
1550
+ - type: mrr_at_10
1551
+ value: 56.42
1552
+ - type: mrr_at_100
1553
+ value: 56.908
1554
+ - type: mrr_at_1000
1555
+ value: 56.95
1556
+ - type: mrr_at_3
1557
+ value: 54.283
1558
+ - type: mrr_at_5
1559
+ value: 55.568
1560
+ - type: ndcg_at_1
1561
+ value: 45.666000000000004
1562
+ - type: ndcg_at_10
1563
+ value: 37.389
1564
+ - type: ndcg_at_100
1565
+ value: 34.253
1566
+ - type: ndcg_at_1000
1567
+ value: 43.059999999999995
1568
+ - type: ndcg_at_3
1569
+ value: 42.725
1570
+ - type: ndcg_at_5
1571
+ value: 40.193
1572
+ - type: precision_at_1
1573
+ value: 47.368
1574
+ - type: precision_at_10
1575
+ value: 27.988000000000003
1576
+ - type: precision_at_100
1577
+ value: 8.672
1578
+ - type: precision_at_1000
1579
+ value: 2.164
1580
+ - type: precision_at_3
1581
+ value: 40.248
1582
+ - type: precision_at_5
1583
+ value: 34.737
1584
+ - type: recall_at_1
1585
+ value: 6.783
1586
+ - type: recall_at_10
1587
+ value: 17.838
1588
+ - type: recall_at_100
1589
+ value: 33.672000000000004
1590
+ - type: recall_at_1000
1591
+ value: 66.166
1592
+ - type: recall_at_3
1593
+ value: 11.849
1594
+ - type: recall_at_5
1595
+ value: 14.205000000000002
1596
+ task:
1597
+ type: Retrieval
1598
+ - dataset:
1599
+ config: default
1600
+ name: MTEB NQ
1601
+ revision: None
1602
+ split: test
1603
+ type: nq
1604
+ metrics:
1605
+ - type: map_at_1
1606
+ value: 31.698999999999998
1607
+ - type: map_at_10
1608
+ value: 46.556
1609
+ - type: map_at_100
1610
+ value: 47.652
1611
+ - type: map_at_1000
1612
+ value: 47.68
1613
+ - type: map_at_3
1614
+ value: 42.492000000000004
1615
+ - type: map_at_5
1616
+ value: 44.763999999999996
1617
+ - type: mrr_at_1
1618
+ value: 35.747
1619
+ - type: mrr_at_10
1620
+ value: 49.242999999999995
1621
+ - type: mrr_at_100
1622
+ value: 50.052
1623
+ - type: mrr_at_1000
1624
+ value: 50.068
1625
+ - type: mrr_at_3
1626
+ value: 45.867000000000004
1627
+ - type: mrr_at_5
1628
+ value: 47.778999999999996
1629
+ - type: ndcg_at_1
1630
+ value: 35.717999999999996
1631
+ - type: ndcg_at_10
1632
+ value: 54.14600000000001
1633
+ - type: ndcg_at_100
1634
+ value: 58.672999999999995
1635
+ - type: ndcg_at_1000
1636
+ value: 59.279
1637
+ - type: ndcg_at_3
1638
+ value: 46.407
1639
+ - type: ndcg_at_5
1640
+ value: 50.181
1641
+ - type: precision_at_1
1642
+ value: 35.717999999999996
1643
+ - type: precision_at_10
1644
+ value: 8.844000000000001
1645
+ - type: precision_at_100
1646
+ value: 1.139
1647
+ - type: precision_at_1000
1648
+ value: 0.12
1649
+ - type: precision_at_3
1650
+ value: 20.993000000000002
1651
+ - type: precision_at_5
1652
+ value: 14.791000000000002
1653
+ - type: recall_at_1
1654
+ value: 31.698999999999998
1655
+ - type: recall_at_10
1656
+ value: 74.693
1657
+ - type: recall_at_100
1658
+ value: 94.15299999999999
1659
+ - type: recall_at_1000
1660
+ value: 98.585
1661
+ - type: recall_at_3
1662
+ value: 54.388999999999996
1663
+ - type: recall_at_5
1664
+ value: 63.08200000000001
1665
+ task:
1666
+ type: Retrieval
1667
+ - dataset:
1668
+ config: default
1669
+ name: MTEB QuoraRetrieval
1670
+ revision: None
1671
+ split: test
1672
+ type: quora
1673
+ metrics:
1674
+ - type: map_at_1
1675
+ value: 71.283
1676
+ - type: map_at_10
1677
+ value: 85.24000000000001
1678
+ - type: map_at_100
1679
+ value: 85.882
1680
+ - type: map_at_1000
1681
+ value: 85.897
1682
+ - type: map_at_3
1683
+ value: 82.326
1684
+ - type: map_at_5
1685
+ value: 84.177
1686
+ - type: mrr_at_1
1687
+ value: 82.21000000000001
1688
+ - type: mrr_at_10
1689
+ value: 88.228
1690
+ - type: mrr_at_100
1691
+ value: 88.32
1692
+ - type: mrr_at_1000
1693
+ value: 88.32
1694
+ - type: mrr_at_3
1695
+ value: 87.323
1696
+ - type: mrr_at_5
1697
+ value: 87.94800000000001
1698
+ - type: ndcg_at_1
1699
+ value: 82.17999999999999
1700
+ - type: ndcg_at_10
1701
+ value: 88.9
1702
+ - type: ndcg_at_100
1703
+ value: 90.079
1704
+ - type: ndcg_at_1000
1705
+ value: 90.158
1706
+ - type: ndcg_at_3
1707
+ value: 86.18299999999999
1708
+ - type: ndcg_at_5
1709
+ value: 87.71799999999999
1710
+ - type: precision_at_1
1711
+ value: 82.17999999999999
1712
+ - type: precision_at_10
1713
+ value: 13.464
1714
+ - type: precision_at_100
1715
+ value: 1.533
1716
+ - type: precision_at_1000
1717
+ value: 0.157
1718
+ - type: precision_at_3
1719
+ value: 37.693
1720
+ - type: precision_at_5
1721
+ value: 24.792
1722
+ - type: recall_at_1
1723
+ value: 71.283
1724
+ - type: recall_at_10
1725
+ value: 95.742
1726
+ - type: recall_at_100
1727
+ value: 99.67200000000001
1728
+ - type: recall_at_1000
1729
+ value: 99.981
1730
+ - type: recall_at_3
1731
+ value: 87.888
1732
+ - type: recall_at_5
1733
+ value: 92.24
1734
+ task:
1735
+ type: Retrieval
1736
+ - dataset:
1737
+ config: default
1738
+ name: MTEB RedditClustering
1739
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1740
+ split: test
1741
+ type: mteb/reddit-clustering
1742
+ metrics:
1743
+ - type: v_measure
1744
+ value: 56.24267063669042
1745
+ task:
1746
+ type: Clustering
1747
+ - dataset:
1748
+ config: default
1749
+ name: MTEB RedditClusteringP2P
1750
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
1751
+ split: test
1752
+ type: mteb/reddit-clustering-p2p
1753
+ metrics:
1754
+ - type: v_measure
1755
+ value: 62.88056988932578
1756
+ task:
1757
+ type: Clustering
1758
+ - dataset:
1759
+ config: default
1760
+ name: MTEB SCIDOCS
1761
+ revision: None
1762
+ split: test
1763
+ type: scidocs
1764
+ metrics:
1765
+ - type: map_at_1
1766
+ value: 4.903
1767
+ - type: map_at_10
1768
+ value: 13.202
1769
+ - type: map_at_100
1770
+ value: 15.5
1771
+ - type: map_at_1000
1772
+ value: 15.870999999999999
1773
+ - type: map_at_3
1774
+ value: 9.407
1775
+ - type: map_at_5
1776
+ value: 11.238
1777
+ - type: mrr_at_1
1778
+ value: 24.2
1779
+ - type: mrr_at_10
1780
+ value: 35.867
1781
+ - type: mrr_at_100
1782
+ value: 37.001
1783
+ - type: mrr_at_1000
1784
+ value: 37.043
1785
+ - type: mrr_at_3
1786
+ value: 32.5
1787
+ - type: mrr_at_5
1788
+ value: 34.35
1789
+ - type: ndcg_at_1
1790
+ value: 24.2
1791
+ - type: ndcg_at_10
1792
+ value: 21.731
1793
+ - type: ndcg_at_100
1794
+ value: 30.7
1795
+ - type: ndcg_at_1000
1796
+ value: 36.618
1797
+ - type: ndcg_at_3
1798
+ value: 20.72
1799
+ - type: ndcg_at_5
1800
+ value: 17.954
1801
+ - type: precision_at_1
1802
+ value: 24.2
1803
+ - type: precision_at_10
1804
+ value: 11.33
1805
+ - type: precision_at_100
1806
+ value: 2.4410000000000003
1807
+ - type: precision_at_1000
1808
+ value: 0.386
1809
+ - type: precision_at_3
1810
+ value: 19.667
1811
+ - type: precision_at_5
1812
+ value: 15.86
1813
+ - type: recall_at_1
1814
+ value: 4.903
1815
+ - type: recall_at_10
1816
+ value: 22.962
1817
+ - type: recall_at_100
1818
+ value: 49.563
1819
+ - type: recall_at_1000
1820
+ value: 78.238
1821
+ - type: recall_at_3
1822
+ value: 11.953
1823
+ - type: recall_at_5
1824
+ value: 16.067999999999998
1825
+ task:
1826
+ type: Retrieval
1827
+ - dataset:
1828
+ config: default
1829
+ name: MTEB SICK-R
1830
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1831
+ split: test
1832
+ type: mteb/sickr-sts
1833
+ metrics:
1834
+ - type: cos_sim_pearson
1835
+ value: 84.12694254604078
1836
+ - type: cos_sim_spearman
1837
+ value: 80.30141815181918
1838
+ - type: euclidean_pearson
1839
+ value: 81.34015449877128
1840
+ - type: euclidean_spearman
1841
+ value: 80.13984197010849
1842
+ - type: manhattan_pearson
1843
+ value: 81.31767068124086
1844
+ - type: manhattan_spearman
1845
+ value: 80.11720513114103
1846
+ task:
1847
+ type: STS
1848
+ - dataset:
1849
+ config: default
1850
+ name: MTEB STS12
1851
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
1852
+ split: test
1853
+ type: mteb/sts12-sts
1854
+ metrics:
1855
+ - type: cos_sim_pearson
1856
+ value: 86.13112984010417
1857
+ - type: cos_sim_spearman
1858
+ value: 78.03063573402875
1859
+ - type: euclidean_pearson
1860
+ value: 83.51928418844804
1861
+ - type: euclidean_spearman
1862
+ value: 78.4045235411144
1863
+ - type: manhattan_pearson
1864
+ value: 83.49981637388689
1865
+ - type: manhattan_spearman
1866
+ value: 78.4042575139372
1867
+ task:
1868
+ type: STS
1869
+ - dataset:
1870
+ config: default
1871
+ name: MTEB STS13
1872
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1873
+ split: test
1874
+ type: mteb/sts13-sts
1875
+ metrics:
1876
+ - type: cos_sim_pearson
1877
+ value: 82.50327987379504
1878
+ - type: cos_sim_spearman
1879
+ value: 84.18556767756205
1880
+ - type: euclidean_pearson
1881
+ value: 82.69684424327679
1882
+ - type: euclidean_spearman
1883
+ value: 83.5368106038335
1884
+ - type: manhattan_pearson
1885
+ value: 82.57967581007374
1886
+ - type: manhattan_spearman
1887
+ value: 83.43009053133697
1888
+ task:
1889
+ type: STS
1890
+ - dataset:
1891
+ config: default
1892
+ name: MTEB STS14
1893
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
1894
+ split: test
1895
+ type: mteb/sts14-sts
1896
+ metrics:
1897
+ - type: cos_sim_pearson
1898
+ value: 82.50756863007814
1899
+ - type: cos_sim_spearman
1900
+ value: 82.27204331279108
1901
+ - type: euclidean_pearson
1902
+ value: 81.39535251429741
1903
+ - type: euclidean_spearman
1904
+ value: 81.84386626336239
1905
+ - type: manhattan_pearson
1906
+ value: 81.34281737280695
1907
+ - type: manhattan_spearman
1908
+ value: 81.81149375673166
1909
+ task:
1910
+ type: STS
1911
+ - dataset:
1912
+ config: default
1913
+ name: MTEB STS15
1914
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
1915
+ split: test
1916
+ type: mteb/sts15-sts
1917
+ metrics:
1918
+ - type: cos_sim_pearson
1919
+ value: 86.8727714856726
1920
+ - type: cos_sim_spearman
1921
+ value: 87.95738287792312
1922
+ - type: euclidean_pearson
1923
+ value: 86.62920602795887
1924
+ - type: euclidean_spearman
1925
+ value: 87.05207355381243
1926
+ - type: manhattan_pearson
1927
+ value: 86.53587918472225
1928
+ - type: manhattan_spearman
1929
+ value: 86.95382961029586
1930
+ task:
1931
+ type: STS
1932
+ - dataset:
1933
+ config: default
1934
+ name: MTEB STS16
1935
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
1936
+ split: test
1937
+ type: mteb/sts16-sts
1938
+ metrics:
1939
+ - type: cos_sim_pearson
1940
+ value: 83.52240359769479
1941
+ - type: cos_sim_spearman
1942
+ value: 85.47685776238286
1943
+ - type: euclidean_pearson
1944
+ value: 84.25815333483058
1945
+ - type: euclidean_spearman
1946
+ value: 85.27415639683198
1947
+ - type: manhattan_pearson
1948
+ value: 84.29127757025637
1949
+ - type: manhattan_spearman
1950
+ value: 85.30226224917351
1951
+ task:
1952
+ type: STS
1953
+ - dataset:
1954
+ config: en-en
1955
+ name: MTEB STS17 (en-en)
1956
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
1957
+ split: test
1958
+ type: mteb/sts17-crosslingual-sts
1959
+ metrics:
1960
+ - type: cos_sim_pearson
1961
+ value: 86.42501708915708
1962
+ - type: cos_sim_spearman
1963
+ value: 86.42276182795041
1964
+ - type: euclidean_pearson
1965
+ value: 86.5408207354761
1966
+ - type: euclidean_spearman
1967
+ value: 85.46096321750838
1968
+ - type: manhattan_pearson
1969
+ value: 86.54177303026881
1970
+ - type: manhattan_spearman
1971
+ value: 85.50313151916117
1972
+ task:
1973
+ type: STS
1974
+ - dataset:
1975
+ config: en
1976
+ name: MTEB STS22 (en)
1977
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
1978
+ split: test
1979
+ type: mteb/sts22-crosslingual-sts
1980
+ metrics:
1981
+ - type: cos_sim_pearson
1982
+ value: 64.86521089250766
1983
+ - type: cos_sim_spearman
1984
+ value: 65.94868540323003
1985
+ - type: euclidean_pearson
1986
+ value: 67.16569626533084
1987
+ - type: euclidean_spearman
1988
+ value: 66.37667004134917
1989
+ - type: manhattan_pearson
1990
+ value: 67.1482365102333
1991
+ - type: manhattan_spearman
1992
+ value: 66.53240122580029
1993
+ task:
1994
+ type: STS
1995
+ - dataset:
1996
+ config: default
1997
+ name: MTEB STSBenchmark
1998
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
1999
+ split: test
2000
+ type: mteb/stsbenchmark-sts
2001
+ metrics:
2002
+ - type: cos_sim_pearson
2003
+ value: 84.64746265365318
2004
+ - type: cos_sim_spearman
2005
+ value: 86.41888825906786
2006
+ - type: euclidean_pearson
2007
+ value: 85.27453642725811
2008
+ - type: euclidean_spearman
2009
+ value: 85.94095796602544
2010
+ - type: manhattan_pearson
2011
+ value: 85.28643660505334
2012
+ - type: manhattan_spearman
2013
+ value: 85.95028003260744
2014
+ task:
2015
+ type: STS
2016
+ - dataset:
2017
+ config: default
2018
+ name: MTEB SciDocsRR
2019
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2020
+ split: test
2021
+ type: mteb/scidocs-reranking
2022
+ metrics:
2023
+ - type: map
2024
+ value: 87.48903153618527
2025
+ - type: mrr
2026
+ value: 96.41081503826601
2027
+ task:
2028
+ type: Reranking
2029
+ - dataset:
2030
+ config: default
2031
+ name: MTEB SciFact
2032
+ revision: None
2033
+ split: test
2034
+ type: scifact
2035
+ metrics:
2036
+ - type: map_at_1
2037
+ value: 58.594
2038
+ - type: map_at_10
2039
+ value: 69.296
2040
+ - type: map_at_100
2041
+ value: 69.782
2042
+ - type: map_at_1000
2043
+ value: 69.795
2044
+ - type: map_at_3
2045
+ value: 66.23
2046
+ - type: map_at_5
2047
+ value: 68.293
2048
+ - type: mrr_at_1
2049
+ value: 61.667
2050
+ - type: mrr_at_10
2051
+ value: 70.339
2052
+ - type: mrr_at_100
2053
+ value: 70.708
2054
+ - type: mrr_at_1000
2055
+ value: 70.722
2056
+ - type: mrr_at_3
2057
+ value: 68.0
2058
+ - type: mrr_at_5
2059
+ value: 69.56700000000001
2060
+ - type: ndcg_at_1
2061
+ value: 61.667
2062
+ - type: ndcg_at_10
2063
+ value: 74.039
2064
+ - type: ndcg_at_100
2065
+ value: 76.103
2066
+ - type: ndcg_at_1000
2067
+ value: 76.47800000000001
2068
+ - type: ndcg_at_3
2069
+ value: 68.967
2070
+ - type: ndcg_at_5
2071
+ value: 71.96900000000001
2072
+ - type: precision_at_1
2073
+ value: 61.667
2074
+ - type: precision_at_10
2075
+ value: 9.866999999999999
2076
+ - type: precision_at_100
2077
+ value: 1.097
2078
+ - type: precision_at_1000
2079
+ value: 0.11299999999999999
2080
+ - type: precision_at_3
2081
+ value: 27.111
2082
+ - type: precision_at_5
2083
+ value: 18.2
2084
+ - type: recall_at_1
2085
+ value: 58.594
2086
+ - type: recall_at_10
2087
+ value: 87.422
2088
+ - type: recall_at_100
2089
+ value: 96.667
2090
+ - type: recall_at_1000
2091
+ value: 99.667
2092
+ - type: recall_at_3
2093
+ value: 74.217
2094
+ - type: recall_at_5
2095
+ value: 81.539
2096
+ task:
2097
+ type: Retrieval
2098
+ - dataset:
2099
+ config: default
2100
+ name: MTEB SprintDuplicateQuestions
2101
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2102
+ split: test
2103
+ type: mteb/sprintduplicatequestions-pairclassification
2104
+ metrics:
2105
+ - type: cos_sim_accuracy
2106
+ value: 99.85049504950496
2107
+ - type: cos_sim_ap
2108
+ value: 96.33111544137081
2109
+ - type: cos_sim_f1
2110
+ value: 92.35443037974684
2111
+ - type: cos_sim_precision
2112
+ value: 93.53846153846153
2113
+ - type: cos_sim_recall
2114
+ value: 91.2
2115
+ - type: dot_accuracy
2116
+ value: 99.82376237623762
2117
+ - type: dot_ap
2118
+ value: 95.38082527310888
2119
+ - type: dot_f1
2120
+ value: 90.90909090909092
2121
+ - type: dot_precision
2122
+ value: 92.90187891440502
2123
+ - type: dot_recall
2124
+ value: 89.0
2125
+ - type: euclidean_accuracy
2126
+ value: 99.84851485148515
2127
+ - type: euclidean_ap
2128
+ value: 96.32316003996347
2129
+ - type: euclidean_f1
2130
+ value: 92.2071392659628
2131
+ - type: euclidean_precision
2132
+ value: 92.71991911021233
2133
+ - type: euclidean_recall
2134
+ value: 91.7
2135
+ - type: manhattan_accuracy
2136
+ value: 99.84851485148515
2137
+ - type: manhattan_ap
2138
+ value: 96.3655668249217
2139
+ - type: manhattan_f1
2140
+ value: 92.18356026222895
2141
+ - type: manhattan_precision
2142
+ value: 92.98067141403867
2143
+ - type: manhattan_recall
2144
+ value: 91.4
2145
+ - type: max_accuracy
2146
+ value: 99.85049504950496
2147
+ - type: max_ap
2148
+ value: 96.3655668249217
2149
+ - type: max_f1
2150
+ value: 92.35443037974684
2151
+ task:
2152
+ type: PairClassification
2153
+ - dataset:
2154
+ config: default
2155
+ name: MTEB StackExchangeClustering
2156
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2157
+ split: test
2158
+ type: mteb/stackexchange-clustering
2159
+ metrics:
2160
+ - type: v_measure
2161
+ value: 65.94861371629051
2162
+ task:
2163
+ type: Clustering
2164
+ - dataset:
2165
+ config: default
2166
+ name: MTEB StackExchangeClusteringP2P
2167
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2168
+ split: test
2169
+ type: mteb/stackexchange-clustering-p2p
2170
+ metrics:
2171
+ - type: v_measure
2172
+ value: 35.009430451385
2173
+ task:
2174
+ type: Clustering
2175
+ - dataset:
2176
+ config: default
2177
+ name: MTEB StackOverflowDupQuestions
2178
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2179
+ split: test
2180
+ type: mteb/stackoverflowdupquestions-reranking
2181
+ metrics:
2182
+ - type: map
2183
+ value: 54.61164066427969
2184
+ - type: mrr
2185
+ value: 55.49710603938544
2186
+ task:
2187
+ type: Reranking
2188
+ - dataset:
2189
+ config: default
2190
+ name: MTEB SummEval
2191
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2192
+ split: test
2193
+ type: mteb/summeval
2194
+ metrics:
2195
+ - type: cos_sim_pearson
2196
+ value: 30.622620124907662
2197
+ - type: cos_sim_spearman
2198
+ value: 31.0678351356163
2199
+ - type: dot_pearson
2200
+ value: 30.863727693306814
2201
+ - type: dot_spearman
2202
+ value: 31.230306567021255
2203
+ task:
2204
+ type: Summarization
2205
+ - dataset:
2206
+ config: default
2207
+ name: MTEB TRECCOVID
2208
+ revision: None
2209
+ split: test
2210
+ type: trec-covid
2211
+ metrics:
2212
+ - type: map_at_1
2213
+ value: 0.22
2214
+ - type: map_at_10
2215
+ value: 2.011
2216
+ - type: map_at_100
2217
+ value: 10.974
2218
+ - type: map_at_1000
2219
+ value: 25.819
2220
+ - type: map_at_3
2221
+ value: 0.6649999999999999
2222
+ - type: map_at_5
2223
+ value: 1.076
2224
+ - type: mrr_at_1
2225
+ value: 86.0
2226
+ - type: mrr_at_10
2227
+ value: 91.8
2228
+ - type: mrr_at_100
2229
+ value: 91.8
2230
+ - type: mrr_at_1000
2231
+ value: 91.8
2232
+ - type: mrr_at_3
2233
+ value: 91.0
2234
+ - type: mrr_at_5
2235
+ value: 91.8
2236
+ - type: ndcg_at_1
2237
+ value: 82.0
2238
+ - type: ndcg_at_10
2239
+ value: 78.07300000000001
2240
+ - type: ndcg_at_100
2241
+ value: 58.231
2242
+ - type: ndcg_at_1000
2243
+ value: 51.153000000000006
2244
+ - type: ndcg_at_3
2245
+ value: 81.123
2246
+ - type: ndcg_at_5
2247
+ value: 81.059
2248
+ - type: precision_at_1
2249
+ value: 86.0
2250
+ - type: precision_at_10
2251
+ value: 83.0
2252
+ - type: precision_at_100
2253
+ value: 59.38
2254
+ - type: precision_at_1000
2255
+ value: 22.55
2256
+ - type: precision_at_3
2257
+ value: 87.333
2258
+ - type: precision_at_5
2259
+ value: 86.8
2260
+ - type: recall_at_1
2261
+ value: 0.22
2262
+ - type: recall_at_10
2263
+ value: 2.2079999999999997
2264
+ - type: recall_at_100
2265
+ value: 14.069
2266
+ - type: recall_at_1000
2267
+ value: 47.678
2268
+ - type: recall_at_3
2269
+ value: 0.7040000000000001
2270
+ - type: recall_at_5
2271
+ value: 1.161
2272
+ task:
2273
+ type: Retrieval
2274
+ - dataset:
2275
+ config: default
2276
+ name: MTEB Touche2020
2277
+ revision: None
2278
+ split: test
2279
+ type: webis-touche2020
2280
+ metrics:
2281
+ - type: map_at_1
2282
+ value: 2.809
2283
+ - type: map_at_10
2284
+ value: 10.394
2285
+ - type: map_at_100
2286
+ value: 16.598
2287
+ - type: map_at_1000
2288
+ value: 18.142
2289
+ - type: map_at_3
2290
+ value: 5.572
2291
+ - type: map_at_5
2292
+ value: 7.1370000000000005
2293
+ - type: mrr_at_1
2294
+ value: 32.653
2295
+ - type: mrr_at_10
2296
+ value: 46.564
2297
+ - type: mrr_at_100
2298
+ value: 47.469
2299
+ - type: mrr_at_1000
2300
+ value: 47.469
2301
+ - type: mrr_at_3
2302
+ value: 42.177
2303
+ - type: mrr_at_5
2304
+ value: 44.524
2305
+ - type: ndcg_at_1
2306
+ value: 30.612000000000002
2307
+ - type: ndcg_at_10
2308
+ value: 25.701
2309
+ - type: ndcg_at_100
2310
+ value: 37.532
2311
+ - type: ndcg_at_1000
2312
+ value: 48.757
2313
+ - type: ndcg_at_3
2314
+ value: 28.199999999999996
2315
+ - type: ndcg_at_5
2316
+ value: 25.987
2317
+ - type: precision_at_1
2318
+ value: 32.653
2319
+ - type: precision_at_10
2320
+ value: 23.469
2321
+ - type: precision_at_100
2322
+ value: 7.9799999999999995
2323
+ - type: precision_at_1000
2324
+ value: 1.5350000000000001
2325
+ - type: precision_at_3
2326
+ value: 29.932
2327
+ - type: precision_at_5
2328
+ value: 26.122
2329
+ - type: recall_at_1
2330
+ value: 2.809
2331
+ - type: recall_at_10
2332
+ value: 16.887
2333
+ - type: recall_at_100
2334
+ value: 48.67
2335
+ - type: recall_at_1000
2336
+ value: 82.89699999999999
2337
+ - type: recall_at_3
2338
+ value: 6.521000000000001
2339
+ - type: recall_at_5
2340
+ value: 9.609
2341
+ task:
2342
+ type: Retrieval
2343
+ - dataset:
2344
+ config: default
2345
+ name: MTEB ToxicConversationsClassification
2346
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2347
+ split: test
2348
+ type: mteb/toxic_conversations_50k
2349
+ metrics:
2350
+ - type: accuracy
2351
+ value: 71.57860000000001
2352
+ - type: ap
2353
+ value: 13.82629211536393
2354
+ - type: f1
2355
+ value: 54.59860966183956
2356
+ task:
2357
+ type: Classification
2358
+ - dataset:
2359
+ config: default
2360
+ name: MTEB TweetSentimentExtractionClassification
2361
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2362
+ split: test
2363
+ type: mteb/tweet_sentiment_extraction
2364
+ metrics:
2365
+ - type: accuracy
2366
+ value: 59.38030560271647
2367
+ - type: f1
2368
+ value: 59.69685552567865
2369
+ task:
2370
+ type: Classification
2371
+ - dataset:
2372
+ config: default
2373
+ name: MTEB TwentyNewsgroupsClustering
2374
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2375
+ split: test
2376
+ type: mteb/twentynewsgroups-clustering
2377
+ metrics:
2378
+ - type: v_measure
2379
+ value: 51.4736717043405
2380
+ task:
2381
+ type: Clustering
2382
+ - dataset:
2383
+ config: default
2384
+ name: MTEB TwitterSemEval2015
2385
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2386
+ split: test
2387
+ type: mteb/twittersemeval2015-pairclassification
2388
+ metrics:
2389
+ - type: cos_sim_accuracy
2390
+ value: 86.92853311080646
2391
+ - type: cos_sim_ap
2392
+ value: 77.67872502591382
2393
+ - type: cos_sim_f1
2394
+ value: 70.33941236068895
2395
+ - type: cos_sim_precision
2396
+ value: 67.63273258645884
2397
+ - type: cos_sim_recall
2398
+ value: 73.27176781002639
2399
+ - type: dot_accuracy
2400
+ value: 85.79603027954938
2401
+ - type: dot_ap
2402
+ value: 73.73786190233379
2403
+ - type: dot_f1
2404
+ value: 67.3437901774235
2405
+ - type: dot_precision
2406
+ value: 65.67201604814443
2407
+ - type: dot_recall
2408
+ value: 69.10290237467018
2409
+ - type: euclidean_accuracy
2410
+ value: 86.94045419324074
2411
+ - type: euclidean_ap
2412
+ value: 77.6687791535167
2413
+ - type: euclidean_f1
2414
+ value: 70.47209214023542
2415
+ - type: euclidean_precision
2416
+ value: 67.7207492094381
2417
+ - type: euclidean_recall
2418
+ value: 73.45646437994723
2419
+ - type: manhattan_accuracy
2420
+ value: 86.87488823985218
2421
+ - type: manhattan_ap
2422
+ value: 77.63373392430728
2423
+ - type: manhattan_f1
2424
+ value: 70.40920716112532
2425
+ - type: manhattan_precision
2426
+ value: 68.31265508684864
2427
+ - type: manhattan_recall
2428
+ value: 72.63852242744063
2429
+ - type: max_accuracy
2430
+ value: 86.94045419324074
2431
+ - type: max_ap
2432
+ value: 77.67872502591382
2433
+ - type: max_f1
2434
+ value: 70.47209214023542
2435
+ task:
2436
+ type: PairClassification
2437
+ - dataset:
2438
+ config: default
2439
+ name: MTEB TwitterURLCorpus
2440
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2441
+ split: test
2442
+ type: mteb/twitterurlcorpus-pairclassification
2443
+ metrics:
2444
+ - type: cos_sim_accuracy
2445
+ value: 88.67155664221679
2446
+ - type: cos_sim_ap
2447
+ value: 85.64591703003417
2448
+ - type: cos_sim_f1
2449
+ value: 77.59531005352656
2450
+ - type: cos_sim_precision
2451
+ value: 73.60967184801382
2452
+ - type: cos_sim_recall
2453
+ value: 82.03726516784724
2454
+ - type: dot_accuracy
2455
+ value: 88.41541506578181
2456
+ - type: dot_ap
2457
+ value: 84.6482788957769
2458
+ - type: dot_f1
2459
+ value: 77.04748541466657
2460
+ - type: dot_precision
2461
+ value: 74.02440754931176
2462
+ - type: dot_recall
2463
+ value: 80.3279950723745
2464
+ - type: euclidean_accuracy
2465
+ value: 88.63080684596576
2466
+ - type: euclidean_ap
2467
+ value: 85.44570045321562
2468
+ - type: euclidean_f1
2469
+ value: 77.28769403336106
2470
+ - type: euclidean_precision
2471
+ value: 72.90600040958427
2472
+ - type: euclidean_recall
2473
+ value: 82.22975053895904
2474
+ - type: manhattan_accuracy
2475
+ value: 88.59393798269105
2476
+ - type: manhattan_ap
2477
+ value: 85.40271361038187
2478
+ - type: manhattan_f1
2479
+ value: 77.17606419344392
2480
+ - type: manhattan_precision
2481
+ value: 72.4447747078295
2482
+ - type: manhattan_recall
2483
+ value: 82.5685247921158
2484
+ - type: max_accuracy
2485
+ value: 88.67155664221679
2486
+ - type: max_ap
2487
+ value: 85.64591703003417
2488
+ - type: max_f1
2489
+ value: 77.59531005352656
2490
+ task:
2491
+ type: PairClassification
2492
+ tags:
2493
+ - sentence-transformers
2494
+ - feature-extraction
2495
+ - sentence-similarity
2496
+ - transformers
2497
+ - mteb
2498
+ - onnx
2499
+ - teradata
2500
+
2501
+ ---
2502
+ # A Teradata Vantage compatible Embeddings Model
2503
+
2504
+ # BAAI/bge-base-en-v1.5
2505
+
2506
+ ## Overview of this Model
2507
+
2508
+ An embedding model that maps text (sentences or paragraphs) into a vector. The [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model is well known for its effectiveness in capturing semantic meaning in text data. It is a state-of-the-art model trained on a large corpus, capable of generating high-quality text embeddings.
2509
+
2510
+ - 109.48M params (Sizes in ONNX format - "fp32": 415.72MB, "int8": 104.75MB, "uint8": 104.75MB)
2511
+ - 512 maximum input tokens
2512
+ - 768 dimensions of output vector
2513
+ - License: MIT. The released model can be used for commercial purposes free of charge.
2514
+ - Reference to Original Model: https://huggingface.co/BAAI/bge-base-en-v1.5
2515
+
2516
+
2517
+ ## Quickstart: Deploying this Model in Teradata Vantage
2518
+
2519
+ We have pre-converted the model into the ONNX format compatible with BYOM 6.0, eliminating the need for manual conversion.
2520
+
2521
+ **Note:** Ensure you have access to a Teradata Database with BYOM 6.0 installed.
2522
+
2523
+ To get started, download the pre-converted model directly from the Teradata HuggingFace repository.
2524
+
2525
+
2526
+ ```python
2527
+
2528
+ import teradataml as tdml
2529
+ import getpass
2530
+ from huggingface_hub import hf_hub_download
2531
+
2532
+ model_name = "bge-base-en-v1.5"
2533
+ number_dimensions_output = 768
2534
+ model_file_name = "model.onnx"
2535
+
2536
+ # Step 1: Download Model from Teradata HuggingFace Page
2537
+
2538
+ hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"onnx/{model_file_name}", local_dir="./")
2539
+ hf_hub_download(repo_id=f"Teradata/{model_name}", filename=f"tokenizer.json", local_dir="./")
2540
+
2541
+ # Step 2: Create Connection to Vantage
2542
+
2543
+ tdml.create_context(host = input('enter your hostname'),
2544
+ username=input('enter your username'),
2545
+ password = getpass.getpass("enter your password"))
2546
+
2547
+ # Step 3: Load Models into Vantage
2548
+ # a) Embedding model
2549
+ tdml.save_byom(model_id = model_name, # must be unique in the models table
2550
+ model_file = model_file_name,
2551
+ table_name = 'embeddings_models' )
2552
+ # b) Tokenizer
2553
+ tdml.save_byom(model_id = model_name, # must be unique in the tokenizers table
2554
+ model_file = 'tokenizer.json',
2555
+ table_name = 'embeddings_tokenizers')
2556
+
2557
+ # Step 4: Test ONNXEmbeddings Function
2558
+ # Note that ONNXEmbeddings expects the 'payload' column to be 'txt'.
2559
+ # If it has a different name, just rename it in a subquery/CTE (see the example after this code block).
2560
+ input_table = "emails.emails"
2561
+ embeddings_query = f"""
2562
+ SELECT
2563
+ *
2564
+ from mldb.ONNXEmbeddings(
2565
+ on {input_table} as InputTable
2566
+ on (select * from embeddings_models where model_id = '{model_name}') as ModelTable DIMENSION
2567
+ on (select model as tokenizer from embeddings_tokenizers where model_id = '{model_name}') as TokenizerTable DIMENSION
2568
+ using
2569
+ Accumulate('id', 'txt')
2570
+ ModelOutputTensor('sentence_embedding')
2571
+ EnableMemoryCheck('false')
2572
+ OutputFormat('FLOAT32({number_dimensions_output})')
2573
+ OverwriteCachedModel('true')
2574
+ ) a
2575
+ """
2576
+ DF_embeddings = tdml.DataFrame.from_query(embeddings_query)
2577
+ DF_embeddings
2578
+ ```
2579
+
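+ If your source table's text column is not named `txt`, a small subquery is enough to rename it before handing the table to `ONNXEmbeddings`. The following is a minimal, hypothetical sketch: the database `my_database`, table `my_table`, and column `body` are placeholders for your own schema; everything else in the `embeddings_query` above stays unchanged.
+
+ ```python
+ # Hypothetical example: the text lives in a column called 'body' instead of 'txt'.
+ # The subquery renames it so that ONNXEmbeddings sees a 'txt' payload column.
+ input_table = "(select id, body as txt from my_database.my_table)"
+ ```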
2580
+
2581
+
2582
+ ## What Can I Do with the Embeddings?
2583
+
2584
+ Teradata Vantage includes pre-built in-database functions to process embeddings further. Explore the following examples, and see the similarity-search sketch right after the list:
2585
+
2586
+ - **Semantic Clustering with TD_KMeans:** [Semantic Clustering Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Clustering_Python.ipynb)
2587
+ - **Semantic Distance with TD_VectorDistance:** [Semantic Similarity Python Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/Semantic_Similarity_Python.ipynb)
2588
+ - **RAG-Based Application with TD_VectorDistance:** [RAG and Bedrock Query PDF Notebook](https://github.com/Teradata/jupyter-demos/blob/main/UseCases/Language_Models_InVantage/RAG_and_Bedrock_QueryPDF.ipynb)
2589
+
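+ For instance, a semantic similarity search over stored embeddings can run entirely in-database with `TD_VECTORDISTANCE`. The snippet below is a minimal sketch that mirrors the logic in [test_teradata.py](./test_teradata.py): it assumes the embeddings from the Quickstart were persisted to a table `emails_embeddings_store` with an `id` column and embedding columns `emb_0` ... `emb_767`, and that a `teradataml` context is already open.
+
+ ```python
+ import teradataml as tdml
+
+ # Assumed layout: emails_embeddings_store(id, txt, emb_0 ... emb_767)
+ tdf = tdml.DataFrame('emails_embeddings_store')
+ target = tdf[tdf.id == 3]       # the row we want matches for
+ reference = tdf[tdf.id != 3]    # candidate rows
+
+ # TD_VECTORDISTANCE computes cosine distances in-database; similarity = 1 - distance
+ similar = tdml.DataFrame.from_query(f"""
+ SELECT dt.target_id, dt.reference_id, (1.0 - dt.distance) AS similarity
+ FROM TD_VECTORDISTANCE(
+     ON ({target.show_query()}) AS TargetTable
+     ON ({reference.show_query()}) AS ReferenceTable DIMENSION
+     USING
+         TargetIDColumn('id')
+         TargetFeatureColumns('[emb_0:emb_767]')
+         RefIDColumn('id')
+         RefFeatureColumns('[emb_0:emb_767]')
+         DistanceMeasure('cosine')
+         topk(3)
+ ) AS dt
+ """)
+ print(similar.to_pandas())
+ ```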
2590
+
2591
+ ## Deep Dive into Model Conversion to ONNX
2592
+
2593
+ **The steps below outline how we converted the open-source Hugging Face model into an ONNX file compatible with the in-database ONNXEmbeddings function.**
2594
+
2595
+ You do not need to perform these steps—they are provided solely for documentation and transparency. However, they may be helpful if you wish to convert another model to the required format.
2596
+
2597
+
2598
+ ### Part 1. Importing and Converting Model using optimum
2599
+
2600
+ We start by importing the pre-trained [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) model from Hugging Face.
2601
+
2602
+ To enhance performance and ensure compatibility with various execution environments, we'll use the [Optimum](https://github.com/huggingface/optimum) utility to convert the model into the ONNX (Open Neural Network Exchange) format.
2603
+
2604
+ After conversion to ONNX, we fix the opset in the ONNX file for compatibility with the ONNX runtime used in Teradata Vantage.
2605
+
2606
+ We generate ONNX files for multiple precisions: fp32, int8, and uint8.
2607
+
2608
+ You can find the detailed conversion steps in the file [convert.py](./convert.py)
2609
+
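+ For reference, the core of that flow condenses to roughly the sketch below; the full, config-driven script is [convert.py](./convert.py).
+
+ ```python
+ # Condensed sketch of the conversion flow implemented in convert.py
+ import os
+ import shutil
+ from optimum.exporters.onnx import main_export
+ from onnxruntime.quantization import quantize_dynamic, QuantType
+
+ model_id = "BAAI/bge-base-en-v1.5"
+ opset = 16
+ os.makedirs("onnx", exist_ok=True)
+
+ # Export the fp32 feature-extraction (embedding) model to ONNX with Optimum
+ main_export(model_name_or_path=model_id, output="./", opset=opset,
+             trust_remote_code=True, task="feature-extraction", dtype="fp32")
+ shutil.copyfile("model.onnx", "onnx/model.onnx")
+
+ # Produce the quantized int8 / uint8 variants via dynamic quantization
+ quantize_dynamic("model.onnx", "onnx/model_int8.onnx", weight_type=QuantType.QInt8)
+ quantize_dynamic("model.onnx", "onnx/model_uint8.onnx", weight_type=QuantType.QUInt8)
+ ```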
2610
+ ### Part 2. Running the model in Python with onnxruntime & comparing results
2611
+
2612
+ Once the fixes are applied, we test the correctness of the ONNX model by calculating the cosine similarity between two texts with both the native SentenceTransformers model and the ONNX runtime, and comparing the results.
2613
+
2614
+ If the results are identical, it confirms that the ONNX model gives the same result as the native models, validating its correctness and suitability for further use in the database.
2615
+
2616
+
2617
+ ```python
2618
+ import onnxruntime as rt
2619
+
2620
+ from sentence_transformers.util import cos_sim
2621
+ from sentence_transformers import SentenceTransformer
2622
+
2623
+ import transformers
2624
+
2625
+
2626
+ sentences_1 = 'How is the weather today?'
2627
+ sentences_2 = 'What is the current weather like today?'
2628
+
2629
+ # Calculate ONNX result
2630
+ tokenizer = transformers.AutoTokenizer.from_pretrained("BAAI/bge-base-en-v1.5")
2631
+ predef_sess = rt.InferenceSession("onnx/model.onnx")
2632
+
2633
+ enc1 = tokenizer(sentences_1)
2634
+ embeddings_1_onnx = predef_sess.run(None, {"input_ids": [enc1.input_ids],
2635
+ "attention_mask": [enc1.attention_mask]})
2636
+
2637
+ enc2 = tokenizer(sentences_2)
2638
+ embeddings_2_onnx = predef_sess.run(None, {"input_ids": [enc2.input_ids],
2639
+ "attention_mask": [enc2.attention_mask]})
2640
+
2641
+
2642
+ # Calculate embeddings with SentenceTransformer
2643
+ model = SentenceTransformer("BAAI/bge-base-en-v1.5", trust_remote_code=True)
2644
+ embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True)
2645
+ embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True)
2646
+
2647
+ # Compare results
2648
+ print("Cosine similiarity for embeddings calculated with ONNX:" + str(cos_sim(embeddings_1_onnx[1][0], embeddings_2_onnx[1][0])))
2649
+ print("Cosine similiarity for embeddings calculated with SentenceTransformer:" + str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
2650
+ ```
2651
+
2652
+ You can find the detailed ONNX vs. SentenceTransformer result comparison steps in the file [test_local.py](./test_local.py)
2653
+
config.json ADDED
@@ -0,0 +1,34 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_attn_implementation_autoset": true,
3
+ "_name_or_path": "BAAI/bge-base-en-v1.5",
4
+ "architectures": [
5
+ "BertModel"
6
+ ],
7
+ "attention_probs_dropout_prob": 0.1,
8
+ "classifier_dropout": null,
9
+ "export_model_type": "transformer",
10
+ "gradient_checkpointing": false,
11
+ "hidden_act": "gelu",
12
+ "hidden_dropout_prob": 0.1,
13
+ "hidden_size": 768,
14
+ "id2label": {
15
+ "0": "LABEL_0"
16
+ },
17
+ "initializer_range": 0.02,
18
+ "intermediate_size": 3072,
19
+ "label2id": {
20
+ "LABEL_0": 0
21
+ },
22
+ "layer_norm_eps": 1e-12,
23
+ "max_position_embeddings": 512,
24
+ "model_type": "bert",
25
+ "num_attention_heads": 12,
26
+ "num_hidden_layers": 12,
27
+ "pad_token_id": 0,
28
+ "position_embedding_type": "absolute",
29
+ "torch_dtype": "float32",
30
+ "transformers_version": "4.47.1",
31
+ "type_vocab_size": 2,
32
+ "use_cache": true,
33
+ "vocab_size": 30522
34
+ }
conversion_config.json ADDED
@@ -0,0 +1,11 @@
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "model_id": "BAAI/bge-base-en-v1.5",
3
+ "number_of_generated_embeddings": 768,
4
+ "precision_to_filename_map": {
5
+ "fp32": "onnx/model.onnx",
6
+ "int8": "onnx/model_int8.onnx",
7
+ "uint8": "onnx/model_uint8.onnx"
8
+ },
9
+ "opset": 16,
10
+ "IR": 8
11
+ }
convert.py ADDED
@@ -0,0 +1,51 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import os
2
+ import json
3
+ import shutil
4
+
5
+ from optimum.exporters.onnx import main_export
6
+ import onnx
7
+ from onnxconverter_common import float16
8
+ import onnxruntime as rt
9
+ from onnxruntime.tools.onnx_model_utils import *
10
+ from onnxruntime.quantization import quantize_dynamic, QuantType
11
+
12
+ with open('conversion_config.json') as json_file:
13
+ conversion_config = json.load(json_file)
14
+
15
+
16
+ model_id = conversion_config["model_id"]
17
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
18
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
19
+ opset = conversion_config["opset"]
20
+ IR = conversion_config["IR"]
21
+
22
+
23
+ op = onnx.OperatorSetIdProto()
24
+ op.version = opset
25
+
26
+
27
+ if not os.path.exists("onnx"):
28
+ os.makedirs("onnx")
29
+
30
+ print("Exporting the main model version")
31
+
32
+ main_export(model_name_or_path=model_id, output="./", opset=opset, trust_remote_code=True, task="feature-extraction", dtype="fp32")
33
+
34
+ if "fp32" in precision_to_filename_map:
35
+ print("Exporting the fp32 onnx file...")
36
+
37
+ shutil.copyfile('model.onnx', precision_to_filename_map["fp32"])
38
+
39
+ print("Done\n\n")
40
+
41
+ if "int8" in precision_to_filename_map:
42
+ print("Quantizing fp32 model to int8...")
43
+ quantize_dynamic("model.onnx", precision_to_filename_map["int8"], weight_type=QuantType.QInt8)
44
+ print("Done\n\n")
45
+
46
+ if "uint8" in precision_to_filename_map:
47
+ print("Quantizing fp32 model to uint8...")
48
+ quantize_dynamic("model.onnx", precision_to_filename_map["uint8"], weight_type=QuantType.QUInt8)
49
+ print("Done\n\n")
50
+
51
+ os.remove("model.onnx")
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:81839c8d8c266e14a40b2f1fe9dec105a8c5c242c1fe69933c6211dc762d8dd8
3
+ size 435913661
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:59e021271eec474fa6eb675fa0ad4634a47446fda83a0792945b880e1130a387
3
+ size 109837980
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:100874a3b224a8d15c5069e8ebd621b1bcef47626f1439ed7f8f85fad90f50a8
3
+ size 109838017
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
test_local.py ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import onnxruntime as rt
2
+
3
+ from sentence_transformers.util import cos_sim
4
+ from sentence_transformers import SentenceTransformer
5
+
6
+ import transformers
7
+
8
+ import gc
9
+ import json
10
+
11
+
12
+ with open('conversion_config.json') as json_file:
13
+ conversion_config = json.load(json_file)
14
+
15
+
16
+ model_id = conversion_config["model_id"]
17
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
18
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
19
+
20
+ sentences_1 = 'How is the weather today?'
21
+ sentences_2 = 'What is the current weather like today?'
22
+
23
+ print(f"Testing on cosine similiarity between sentences: \n'{sentences_1}'\n'{sentences_2}'\n\n\n")
24
+
25
+ tokenizer = transformers.AutoTokenizer.from_pretrained("./")
26
+ enc1 = tokenizer(sentences_1)
27
+ enc2 = tokenizer(sentences_2)
28
+
29
+ for precision, file_name in precision_to_filename_map.items():
30
+
31
+
32
+ onnx_session = rt.InferenceSession(file_name)
33
+ embeddings_1_onnx = onnx_session.run(None, {"input_ids": [enc1.input_ids],
34
+ "attention_mask": [enc1.attention_mask]})[1][0]
35
+
36
+ embeddings_2_onnx = onnx_session.run(None, {"input_ids": [enc2.input_ids],
37
+ "attention_mask": [enc2.attention_mask]})[1][0]
38
+
39
+ del onnx_session
40
+ gc.collect()
41
+     print(f'Cosine similarity for ONNX model with precision "{precision}" is {str(cos_sim(embeddings_1_onnx, embeddings_2_onnx))}')
42
+
43
+
44
+
45
+
46
+ model = SentenceTransformer(model_id, trust_remote_code=True)
47
+ embeddings_1_sentence_transformer = model.encode(sentences_1, normalize_embeddings=True)
48
+ embeddings_2_sentence_transformer = model.encode(sentences_2, normalize_embeddings=True)
49
+ print('Cosine similarity for the original sentence transformer model is ' + str(cos_sim(embeddings_1_sentence_transformer, embeddings_2_sentence_transformer)))
test_teradata.py ADDED
@@ -0,0 +1,106 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import teradataml as tdml
3
+ from tabulate import tabulate
4
+
5
+ import json
6
+
7
+
8
+ with open('conversion_config.json') as json_file:
9
+ conversion_config = json.load(json_file)
10
+
11
+
12
+ model_id = conversion_config["model_id"]
13
+ number_of_generated_embeddings = conversion_config["number_of_generated_embeddings"]
14
+ precision_to_filename_map = conversion_config["precision_to_filename_map"]
15
+
16
+ host = sys.argv[1]
17
+ username = sys.argv[2]
18
+ password = sys.argv[3]
19
+
20
+ print("Setting up connection to teradata...")
21
+ tdml.create_context(host = host, username = username, password = password)
22
+ print("Done\n\n")
23
+
24
+
25
+ print("Deploying tokenizer...")
26
+ try:
27
+ tdml.db_drop_table('tokenizer_table')
28
+ except Exception:
29
+     print("Can't drop tokenizer table - it does not exist")
30
+ tdml.save_byom('tokenizer',
31
+ 'tokenizer.json',
32
+ 'tokenizer_table')
33
+ print("Done\n\n")
34
+
35
+ print("Testing models...")
36
+ try:
37
+ tdml.db_drop_table('model_table')
38
+ except Exception:
39
+     print("Can't drop model table - it does not exist")
40
+
41
+ for precision, file_name in precision_to_filename_map.items():
42
+ print(f"Deploying {precision} model...")
43
+ tdml.save_byom(precision,
44
+ file_name,
45
+ 'model_table')
46
+ print(f"Model {precision} is deployed\n")
47
+
48
+ print(f"Calculating embeddings with {precision} model...")
49
+ try:
50
+ tdml.db_drop_table('emails_embeddings_store')
51
+     except Exception:
52
+         print("Can't drop embeddings table - it does not exist")
53
+
54
+ tdml.execute_sql(f"""
55
+ create volatile table emails_embeddings_store as (
56
+ select
57
+ *
58
+ from mldb.ONNXEmbeddings(
59
+ on emails.emails as InputTable
60
+ on (select * from model_table where model_id = '{precision}') as ModelTable DIMENSION
61
+ on (select model as tokenizer from tokenizer_table where model_id = 'tokenizer') as TokenizerTable DIMENSION
62
+
63
+ using
64
+ Accumulate('id', 'txt')
65
+ ModelOutputTensor('sentence_embedding')
66
+ EnableMemoryCheck('false')
67
+ OutputFormat('FLOAT32({number_of_generated_embeddings})')
68
+ OverwriteCachedModel('true')
69
+ ) a
70
+ ) with data on commit preserve rows
71
+
72
+ """)
73
+ print("Embeddings calculated")
74
+     print(f"Testing semantic search with cosine similarity on the output of the model with precision '{precision}'...")
75
+ tdf_embeddings_store = tdml.DataFrame('emails_embeddings_store')
76
+ tdf_embeddings_store_tgt = tdf_embeddings_store[tdf_embeddings_store.id == 3]
77
+
78
+ tdf_embeddings_store_ref = tdf_embeddings_store[tdf_embeddings_store.id != 3]
79
+
80
+ cos_sim_pd = tdml.DataFrame.from_query(f"""
81
+ SELECT
82
+ dt.target_id,
83
+ dt.reference_id,
84
+ e_tgt.txt as target_txt,
85
+ e_ref.txt as reference_txt,
86
+         (1.0 - dt.distance) as similarity
87
+ FROM
88
+ TD_VECTORDISTANCE (
89
+ ON ({tdf_embeddings_store_tgt.show_query()}) AS TargetTable
90
+ ON ({tdf_embeddings_store_ref.show_query()}) AS ReferenceTable DIMENSION
91
+ USING
92
+ TargetIDColumn('id')
93
+ TargetFeatureColumns('[emb_0:emb_{number_of_generated_embeddings - 1}]')
94
+ RefIDColumn('id')
95
+ RefFeatureColumns('[emb_0:emb_{number_of_generated_embeddings - 1}]')
96
+ DistanceMeasure('cosine')
97
+ topk(3)
98
+ ) AS dt
99
+ JOIN emails.emails e_tgt on e_tgt.id = dt.target_id
100
+ JOIN emails.emails e_ref on e_ref.id = dt.reference_id;
101
+ """).to_pandas()
102
+ print(tabulate(cos_sim_pd, headers='keys', tablefmt='fancy_grid'))
103
+ print("Done\n\n")
104
+
105
+
106
+ tdml.remove_context()
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "extra_special_tokens": {},
49
+ "mask_token": "[MASK]",
50
+ "model_max_length": 512,
51
+ "never_split": null,
52
+ "pad_token": "[PAD]",
53
+ "sep_token": "[SEP]",
54
+ "strip_accents": null,
55
+ "tokenize_chinese_chars": true,
56
+ "tokenizer_class": "BertTokenizer",
57
+ "unk_token": "[UNK]"
58
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff