elsayovita committed on
Commit
4b04516
1 Parent(s): f8797d9

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
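
This pooling config enables only `pooling_mode_cls_token`, i.e. the sentence embedding is the hidden state of the `[CLS]` token rather than a mean over all tokens. A minimal sketch of that operation on a hypothetical batch of token embeddings:

```python
import torch

# Hypothetical encoder output: (batch, seq_len, hidden) with hidden = 384
token_embeddings = torch.randn(2, 16, 384)

# CLS pooling as configured above: keep only the first token's embedding
sentence_embeddings = token_embeddings[:, 0]
print(sentence_embeddings.shape)  # torch.Size([2, 384])
```
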
README.md ADDED
@@ -0,0 +1,812 @@
---
base_model: BAAI/bge-small-en-v1.5
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:11863
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: In the fiscal year 2022, the emissions were categorized into different
    scopes, with each scope representing a specific source of emissions
  sentences:
  - 'Question: What is NetLink proactive in identifying to be more efficient in? '
  - What standard is the Environment, Health, and Safety Management System (EHSMS)
    audited to by a third-party accredited certification body at the operational assets
    level of CLI?
  - What do the different scopes represent in terms of emissions in the fiscal year
    2022?
- source_sentence: NetLink is committed to protecting the security of all information
    and information systems, including both end-user data and corporate data. To this
    end, management ensures that the appropriate IT policies, personal data protection
    policy, risk mitigation strategies, cyber security programmes, systems, processes,
    and controls are in place to protect our IT systems and confidential data
  sentences:
  - '"What recognition did NetLink receive in FY22?"'
  - What measures does NetLink have in place to protect the security of all information
    and information systems, including end-user data and corporate data?
  - 'Question: What does Disclosure 102-10 discuss regarding the organization and
    its supply chain?'
- source_sentence: In the domain of economic performance, the focus is on the financial
    health and growth of the organization, ensuring sustainable profitability and
    value creation for stakeholders
  sentences:
  - What does NetLink prioritize by investing in its network to ensure reliability
    and quality of infrastructure?
  - What percentage of the total energy was accounted for by heat, steam, and chilled
    water in 2021 according to the given information?
  - What is the focus in the domain of economic performance, ensuring sustainable
    profitability and value creation for stakeholders?
- source_sentence: Disclosure 102-41 discusses collective bargaining agreements and
    is found on page 98
  sentences:
  - What topic is discussed in Disclosure 102-41 on page 98 of the document?
  - What was the number of cases in 2021, following a decrease from 42 cases in 2020?
  - What type of data does GRI 101 provide in relation to connecting the nation?
- source_sentence: Employee health and well-being has never been more topical than
    it was in the past year. We understand that people around the world, including
    our employees, have been increasingly exposed to factors affecting their physical
    and mental wellbeing. We are committed to creating an environment that supports
    our employees and ensures they feel valued and have a sense of belonging. We utilised
  sentences:
  - What aspect of the standard covers the evaluation of the management approach?
  - 'Question: What is the company''s commitment towards its employees'' health and
    well-being based on the provided context information?'
  - What types of skills does NetLink focus on developing through their training and
    development opportunities for employees?
model-index:
- name: BAAI BGE small en v1.5 ESG
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 384
      type: dim_384
    metrics:
    - type: cosine_accuracy@1
      value: 0.7661637022675546
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9170530220011801
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9370311051167496
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9542274298238219
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7661637022675546
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30568434066706
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.18740622102334994
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09542274298238222
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.021282325062987634
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.025473695055588344
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.026028641808798603
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.026506317495106176
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.19177581579273692
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.843606136995247
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.023463069757038203
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.7621175082188316
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9118266880215797
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9353451909297816
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9527944027648992
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7621175082188316
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3039422293405265
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.18706903818595635
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09527944027648994
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.02116993078385644
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.025328519111710558
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.025981810859160608
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.026466511187913874
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.19114210787645763
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8402866254821924
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.023374206451884923
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.7469442805361207
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.898423670235185
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9232066087836129
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9444491275394082
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7469442805361207
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2994745567450616
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1846413217567226
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09444491275394083
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.020748452237114468
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02495621306208848
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.025644628021767035
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02623469798720579
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.1883811701569402
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8264706590720244
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.02300099952981619
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.7106128298069628
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8668970749388856
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.8978336002697462
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9243867487144904
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7106128298069628
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.28896569164629515
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.17956672005394925
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09243867487144905
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.01973924527241564
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02408047430385794
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.02493982222971518
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02567740968651363
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.1818069773338387
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7936283816963235
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.022106633007589808
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 32
      type: dim_32
    metrics:
    - type: cosine_accuracy@1
      value: 0.6166231138835033
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.7788923543791622
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.8194385905757396
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8608277838658013
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6166231138835033
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.259630784793054
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.16388771811514793
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08608277838658013
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.017128419830097316
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.02163589873275451
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.022762183071548335
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.02391188288516115
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.16371507022328244
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7058398528705336
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.019714839230632157
      name: Cosine Map@100
---

# BAAI BGE small en v1.5 ESG

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("elsayovita/bge-small-en-v1.5-esg")
# Run inference
sentences = [
    'Employee health and well-being has never been more topical than it was in the past year. We understand that people around the world, including our employees, have been increasingly exposed to factors affecting their physical and mental wellbeing. We are committed to creating an environment that supports our employees and ensures they feel valued and have a sense of belonging. We utilised',
    "Question: What is the company's commitment towards its employees' health and well-being based on the provided context information?",
    'What types of skills does NetLink focus on developing through their training and development opportunities for employees?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
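
Because the model was trained with a Matryoshka objective over the dimensions 384, 256, 128, 64 and 32 (see Training Details below), its embeddings can also be truncated to one of the smaller dimensions with only a modest quality drop. A minimal sketch, assuming a Sentence Transformers version that supports the `truncate_dim` argument (2.7+):

```python
from sentence_transformers import SentenceTransformer

# Keep only the first 128 dimensions of each embedding
model = SentenceTransformer("elsayovita/bge-small-en-v1.5-esg", truncate_dim=128)

embeddings = model.encode(["NetLink invests in the reliability of its network."])
print(embeddings.shape)
# [1, 128]
```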

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_384`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7662     |
| cosine_accuracy@3   | 0.9171     |
| cosine_accuracy@5   | 0.937      |
| cosine_accuracy@10  | 0.9542     |
| cosine_precision@1  | 0.7662     |
| cosine_precision@3  | 0.3057     |
| cosine_precision@5  | 0.1874     |
| cosine_precision@10 | 0.0954     |
| cosine_recall@1     | 0.0213     |
| cosine_recall@3     | 0.0255     |
| cosine_recall@5     | 0.026      |
| cosine_recall@10    | 0.0265     |
| cosine_ndcg@10      | 0.1918     |
| cosine_mrr@10       | 0.8436     |
| **cosine_map@100**  | **0.0235** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7621     |
| cosine_accuracy@3   | 0.9118     |
| cosine_accuracy@5   | 0.9353     |
| cosine_accuracy@10  | 0.9528     |
| cosine_precision@1  | 0.7621     |
| cosine_precision@3  | 0.3039     |
| cosine_precision@5  | 0.1871     |
| cosine_precision@10 | 0.0953     |
| cosine_recall@1     | 0.0212     |
| cosine_recall@3     | 0.0253     |
| cosine_recall@5     | 0.026      |
| cosine_recall@10    | 0.0265     |
| cosine_ndcg@10      | 0.1911     |
| cosine_mrr@10       | 0.8403     |
| **cosine_map@100**  | **0.0234** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.7469    |
| cosine_accuracy@3   | 0.8984    |
| cosine_accuracy@5   | 0.9232    |
| cosine_accuracy@10  | 0.9444    |
| cosine_precision@1  | 0.7469    |
| cosine_precision@3  | 0.2995    |
| cosine_precision@5  | 0.1846    |
| cosine_precision@10 | 0.0944    |
| cosine_recall@1     | 0.0207    |
| cosine_recall@3     | 0.025     |
| cosine_recall@5     | 0.0256    |
| cosine_recall@10    | 0.0262    |
| cosine_ndcg@10      | 0.1884    |
| cosine_mrr@10       | 0.8265    |
| **cosine_map@100**  | **0.023** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7106     |
| cosine_accuracy@3   | 0.8669     |
| cosine_accuracy@5   | 0.8978     |
| cosine_accuracy@10  | 0.9244     |
| cosine_precision@1  | 0.7106     |
| cosine_precision@3  | 0.289      |
| cosine_precision@5  | 0.1796     |
| cosine_precision@10 | 0.0924     |
| cosine_recall@1     | 0.0197     |
| cosine_recall@3     | 0.0241     |
| cosine_recall@5     | 0.0249     |
| cosine_recall@10    | 0.0257     |
| cosine_ndcg@10      | 0.1818     |
| cosine_mrr@10       | 0.7936     |
| **cosine_map@100**  | **0.0221** |

#### Information Retrieval
* Dataset: `dim_32`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.6166     |
| cosine_accuracy@3   | 0.7789     |
| cosine_accuracy@5   | 0.8194     |
| cosine_accuracy@10  | 0.8608     |
| cosine_precision@1  | 0.6166     |
| cosine_precision@3  | 0.2596     |
| cosine_precision@5  | 0.1639     |
| cosine_precision@10 | 0.0861     |
| cosine_recall@1     | 0.0171     |
| cosine_recall@3     | 0.0216     |
| cosine_recall@5     | 0.0228     |
| cosine_recall@10    | 0.0239     |
| cosine_ndcg@10      | 0.1637     |
| cosine_mrr@10       | 0.7058     |
| **cosine_map@100**  | **0.0197** |
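
All five tables were produced with `InformationRetrievalEvaluator`, one run per Matryoshka dimension. A minimal sketch of such a run; the query/corpus entries here are hypothetical placeholders rather than the actual evaluation data:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("elsayovita/bge-small-en-v1.5-esg")

# Hypothetical toy data; the real runs used held-out question/context pairs
queries = {"q1": "What does NetLink prioritize by investing in its network?"}
corpus = {"d1": "NetLink prioritizes reliability and quality of infrastructure."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_384")
print(evaluator(model))  # accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100
```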

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 11,863 training samples
* Columns: <code>context</code> and <code>question</code>
* Approximate statistics based on the first 1000 samples:
  |         | context                                                                             | question                                                                           |
  |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                             |
  | details | <ul><li>min: 13 tokens</li><li>mean: 40.74 tokens</li><li>max: 277 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 24.4 tokens</li><li>max: 62 tokens</li></ul> |
* Samples:
  | context                                                                                                                                                              | question                                                                                                                                                        |
  |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>The engagement with key stakeholders involves various topics and methods throughout the year</code>                                                            | <code>Question: What does the engagement with key stakeholders involve throughout the year?</code>                                                             |
  | <code>For unitholders and analysts, the focus is on business and operations, the release of financial results, and the overall performance and announcements</code> | <code>Question: What is the focus for unitholders and analysts in terms of business and operations, financial results, performance, and announcements?</code>  |
  | <code>These are communicated through press releases and other required disclosures via SGXNet and NetLink's website</code>                                           | <code>What platform is used to communicate press releases and required disclosures for NetLink?</code>                                                         |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          384,
          256,
          128,
          64,
          32
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
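
In code, these parameters amount to wrapping `MultipleNegativesRankingLoss` (in-batch negatives over the context/question pairs) in `MatryoshkaLoss`, which applies the same objective to the embedding truncated to each listed dimension. A minimal sketch:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# In-batch negatives over (context, question) pairs, applied at every
# Matryoshka dimension with equal weight
base_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    base_loss,
    matryoshka_dims=[384, 256, 128, 64, 32],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```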

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 2
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates
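
These settings map directly onto `SentenceTransformerTrainingArguments`. A minimal sketch of reproducing them; `output_dir` is a hypothetical placeholder, and `save_strategy="epoch"` is an assumption needed so `load_best_model_at_end` can compare checkpoints:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-en-v1.5-esg",   # hypothetical output path
    eval_strategy="epoch",
    save_strategy="epoch",                # assumed; required by load_best_model_at_end
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=True,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```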

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_32_cosine_map@100 | dim_384_cosine_map@100 | dim_64_cosine_map@100 |
|:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|:---------------------:|
| 0.4313     | 10     | 4.3426        | -                      | -                      | -                     | -                      | -                     |
| 0.8625     | 20     | 2.7083        | -                      | -                      | -                     | -                      | -                     |
| 1.0350     | 24     | -             | 0.0229                 | 0.0233                 | 0.0195                | 0.0234                 | 0.0220                |
| 1.2264     | 30     | 2.6835        | -                      | -                      | -                     | -                      | -                     |
| 1.6577     | 40     | 2.1702        | -                      | -                      | -                     | -                      | -                     |
| **1.9164** | **46** | **-**         | **0.023**              | **0.0234**             | **0.0197**            | **0.0235**             | **0.0221**            |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.4.0+cu121
- Accelerate: 0.32.1
- Datasets: 2.21.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,31 @@
{
  "_name_or_path": "BAAI/bge-small-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.42.4",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
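
Since the backbone declared here is a standard 12-layer BERT encoder with a hidden size of 384, the checkpoint can also be used with plain `transformers`. A minimal sketch that mirrors the CLS pooling and L2 normalization of the Sentence Transformers pipeline:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("elsayovita/bge-small-en-v1.5-esg")
model = AutoModel.from_pretrained("elsayovita/bge-small-en-v1.5-esg")

inputs = tokenizer(
    ["NetLink invests in the reliability of its network."],
    padding=True, truncation=True, return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# CLS pooling (first token) + L2 normalization, as in 1_Pooling and 2_Normalize
embeddings = F.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 384])
```
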
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.42.4",
    "pytorch": "2.4.0+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6d832829f45b80777f463b80659db365471d4914094e1291ade092824cd23072
size 133462128
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff