RishuD7 committed · Commit cf0c8d4 · verified · 1 Parent(s): 286c185

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
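
The pooling configuration above enables only `pooling_mode_cls_token`, so the sentence embedding is taken from the first (`[CLS]`) token instead of being averaged over tokens. A minimal pure-Python sketch of the difference, assuming token embeddings arrive as a list of per-token vectors; the function names `cls_pool` and `mean_pool` are illustrative and are not the library's API:

```python
from typing import List, Sequence


def cls_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    if not token_embeddings:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings, attention_mask) -> List[float]:
    """Average the vectors of non-padding tokens (disabled in this config)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask removes every token")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy 2-dimensional example: [CLS], one word token, one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(cls_pool(tokens))         # [1.0, 2.0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

With CLS pooling the padding mask plays no role in the pooled vector, since only position 0 is read.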
README.md ADDED
@@ -0,0 +1,610 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:8290
11
+ - loss:MultipleNegativesRankingLoss
12
+ base_model: BAAI/bge-base-en-v1.5
13
+ widget:
14
+ - source_sentence: 30. HOLDING OVER. If Tenant remains in possession of the Leased
15
+ Premises after expiration of the Term, or after any termination of the Lease by
16
+ Landlord without written agreement between the parties, Tenant shall be a tenant
17
+ at sufferance and such tenancy shall be subject to the provisions hereof, except
18
+ that Base Rent for said holdover period shall be one hundred twenty five percent
19
+ (125%) of the amount of Base Rent due in the last month of the Term. Nothing in
20
+ this Section shall be construed as consent by Landlord to the possession of the
21
+ Leased Premises by Tenant after the expiration of the Term or termination of the
22
+ Lease by Landlord.
23
+ sentences:
24
+ - Holding Over
25
+ - Does landlord confirm to no eminent domain on the property
26
+ - Holding Over
27
+ - source_sentence: " Lease other than as specifically provided in the Lease..\n\
28
+ \ (j) To the knowledge of Landlord and/or Assignor, there has been\
29
+ \ no casualty with respect\n to the Premises..\n (k) There does\
30
+ \ not exist any pending, or to the knowledge of Landlord, contemplated,\n \
31
+ \ condemnation or eminent domain proceedings that affect the Premises or any part\
32
+ \ thereof, and Landlord\n has received no notice, oral or written, of the\
33
+ \ intention of any governmental body or other entity to take or\n use all\
34
+ \ or any part thereof.\n\n(d) The Premises contain approximately 3,739 rentable\
35
+ \ square feet, which is 4.18% of the. rentable area of the Project. (e) There\
36
+ \ are no existing defaults on the part of Landlord or Assignor under the Lease;.\
37
+ \ neither party to the Lease has delivered any notice of default to the other;\
38
+ \ and, to the knowledge of Landlord and/or Assignor, no event has occurred that,\
39
+ \ with the giving of notice or the passage of time or both, would constitute a\
40
+ \ default under the Lease.. (f) The current monthly Base Rent payable under the\
41
+ \ Lease is $7,322.21 per month. All rent. payable under the Lease has been paid\
42
+ \ through May 31, 2020. (g) There is currently no security deposit being held\
43
+ \ by Landlord under the Lease.. (h) All improvements to the Premises required\
44
+ \ to be made by Landlord or Assignor under the Lease have been made, and any improvement\
45
+ \ allowances to be paid to Assignor have been fully paid. (i) Landlord has no\
46
+ \ option to terminate or otherwise modify the terms and conditions of the."
47
+ sentences:
48
+ - Does landlord confirm to no eminent domain on the property
49
+ - Signage Rights
50
+ - Signage Rights
51
+ - source_sentence: 17. Condemnation. Either party may terminate this Lease if any
52
+ material part of the Premises is taken or condemned for any public or quasi-public
53
+ use under Law, by eminent domain or private purchase in lieu thereof (a "Taking").
54
+ Landlord shall also have the right to terminate this Lease if there is a Taking
55
+ of any portion of the Building or Property which would have a material adverse
56
+ effect on Landlord's ability to profitably operate the remainder of the Building.
57
+ The terminating party shall provide written notice of termination to the other
58
+ party within 45 days after it first receives notice of the Taking. The termination
59
+ shall be effective as of the effective date of any order granting possession to,
60
+ or vesting legal title in, the condemning authority. If this Lease is not terminated,
61
+ Base Rent and Tenant's Pro Rata Share shall be appropriately adjusted to account
62
+ for any reduction in the square footage of the Building or Premises. All compensation
63
+ awarded for a Taking shall be the property of Landlord. The right to receive compensation
64
+ or proceeds are expressly waived by Tenant, provided, however, Tenant may file
65
+ a separate claim for Tenant's Property and Tenant's reasonable relocation expenses,
66
+ provided the filing of the claim does not diminish the amount of Landlord's award.
67
+ If only a part of the Premises is subject to a Taking and this Lease is not terminated,
68
+ Landlord, with reasonable diligence, will restore the remaining portion of the
69
+ Premises as nearly as practicable to the condition immediately prior to the Taking.
70
+ sentences:
71
+ - Eminent Domain(Detail)
72
+ - Permitted Use
73
+ - Permitted Use
74
+ - source_sentence: " 5. Lessor reserves the right to relocate all or a part of\
75
+ \ parking spaces from floor to\n floor, within one floor, and/or to reasonably\
76
+ \ adjacent offsite location(s), and to\n reasonably allocate them between\
77
+ \ compact and standard size spaces, as Iong as\n the same complies with applicable\
78
+ \ laws, ordinances and regulations..\n.\n 6. Users of the parking area will\
79
+ \ obey all posted signs and park only in the areas\n designated for vehicle\
80
+ \ parking.\n.\n 7. Unless otherwise instructed, every person using the parking\
81
+ \ area is required to\n park and lock his own vehicle. Lessor will not be\
82
+ \ responsible for any damage to\n vehicles, injury to persons or loss of\
83
+ \ property, all of which risks are assumed by\n the party using the parking\
84
+ \ area.\n.\n 8. Validation, if established, will be permissible only by such\
85
+ \ method or methods\n as Lessor and/or its licensee may establish at rates\
86
+ \ generally applicable to visitor\n parking.\n.\n 9. The maintenance,\
87
+ \ washing, waxing or cleaning of vehicles in the parking\n structure or Common\
88
+ \ Areas is prohibited.\n.\n 10. Lessee shall be responsible for seeing that\
89
+ \ all of its employees, agents and\n invitees comply with the applicable\
90
+ \ parking rules, regulations, laws and.\n agreements.\n.\n 11. Lessor\
91
+ \ reserves the right to modify these rules and/or adopt such other\n reasonable\
92
+ \ and non-discriminatory rules and regulations as it may deem.\n necessary\
93
+ \ for the proper operation of the parking area..\n.\n 12. Such parking use\
94
+ \ as is herein provided is intended merely as a license only\n and no bailment\
95
+ \ is intended or shall be created hereby..\n.\n 4/5/2012 \
96
+ \ 14 initials O\n 3. Lessor reserves the right to use\
97
+ \ parking stickers or identification devices\n which shall be the property\
98
+ \ of Lessor and be returned to Lessor by the holder\n thereof upon termination\
99
+ \ of the holder's parking privileges. Lessee will pay such\n replacement\
100
+ \ charge as is reasonably established by Lessor for the loss of such\n devices.\n\
101
+ .\n 4. Lessor reserves the right to refuse the sale of monthly identification\
102
+ \ devices to\n any person or entity that willfully refuses to comply with\
103
+ \ the applicable rules,\n regulations, laws and/or agreements.\n"
104
+ sentences:
105
+ - Does landlord confirm to no eminent domain on the property
106
+ - Permitted Use
107
+ - Right to Relocate (Detail)
108
+ - source_sentence: '31. HOLDING OVER. If Tenant remains in possession of the Leased
109
+ Premises after
110
+
111
+ expiration of the Term, or after any termination of the Lease by Landlord without
112
+ written agreement
113
+
114
+ between the parties, Tenant shall be a tenant at sufferance and such tenancy shall
115
+ be subject to the
116
+
117
+ provisions hereof, except that Rent for said holdover period shall be one hundred
118
+ fifty percent (150%) of
119
+
120
+ the amount of Rent due in the last month of the Term. Nothing in this Section
121
+ 29 shall be construed as
122
+
123
+ consent by Landlord to the possession of the Leased Premises by Tenant after the
124
+ expiration of the Term
125
+
126
+ or termination of the Lease by Landlord. '
127
+ sentences:
128
+ - Holding Over
129
+ - Does landlord confirm to no eminent domain on the property
130
+ - Holding Rent
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ model-index:
+ - name: BGE base En v1.5 Phase 5
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 768
+       type: dim_768
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.011029411764705883
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.022794117647058822
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.03602941176470588
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.07720588235294118
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.011029411764705883
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.0075980392156862735
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.007205882352941177
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.007720588235294118
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.011029411764705883
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.022794117647058822
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.03602941176470588
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.07720588235294118
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.03623828989025581
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.024275501867413632
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.0367475670482882
+       name: Cosine Map@100
+ ---
+
+ # BGE base En v1.5 Phase 5
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
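
The final `Normalize()` stage in the architecture above scales each embedding to unit length, which is why cosine similarity on this model's outputs reduces to a plain dot product. A small pure-Python sketch of that equivalence, using made-up 2-dimensional vectors; the helper names are illustrative only:

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length, as a Normalize() stage would."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0.0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine(a, b))                           # cosine on the raw vectors
print(dot(l2_normalize(a), l2_normalize(b)))  # same value via unit vectors
```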
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("RishuD7/bge-base-en-v1.5-76-keys-phase-6-exp_v1")
+ # Run inference
+ sentences = [
+     '31. HOLDING OVER. If Tenant remains in possession of the Leased Premises after\nexpiration of the Term, or after any termination of the Lease by Landlord without written agreement\nbetween the parties, Tenant shall be a tenant at sufferance and such tenancy shall be subject to the\nprovisions hereof, except that Rent for said holdover period shall be one hundred fifty percent (150%) of\nthe amount of Rent due in the last month of the Term. Nothing in this Section 29 shall be construed as\nconsent by Landlord to the possession of the Leased Premises by Tenant after the expiration of the Term\nor termination of the Lease by Landlord. ',
+     'Holding Rent',
+     'Does landlord confirm to no eminent domain on the property',
+ ]
+ embeddings = model.encode(sentences)  # NumPy array, one row per input text
+ print(embeddings.shape)
+ # (3, 768)
+
+ # Get the pairwise similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+
+ * Dataset: `dim_768`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.011      |
+ | cosine_accuracy@3   | 0.0228     |
+ | cosine_accuracy@5   | 0.036      |
+ | cosine_accuracy@10  | 0.0772     |
+ | cosine_precision@1  | 0.011      |
+ | cosine_precision@3  | 0.0076     |
+ | cosine_precision@5  | 0.0072     |
+ | cosine_precision@10 | 0.0077     |
+ | cosine_recall@1     | 0.011      |
+ | cosine_recall@3     | 0.0228     |
+ | cosine_recall@5     | 0.036      |
+ | cosine_recall@10    | 0.0772     |
+ | **cosine_ndcg@10**  | **0.0362** |
+ | cosine_mrr@10       | 0.0243     |
+ | cosine_map@100      | 0.0367     |
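
In the table above, accuracy@k equals recall@k and precision@k equals accuracy@k divided by k. That is the pattern expected when every query has exactly one relevant document; this is an inference from the numbers, not something the card states. A minimal sketch of the @k metrics under that assumption (the function `retrieval_metrics` and the toy ranks are hypothetical, not the evaluator's implementation):

```python
def retrieval_metrics(ranks, k):
    """Compute @k metrics when each query has exactly one relevant document.

    ranks: 1-based rank of the relevant document per query, or None if it
    was not retrieved at all.
    """
    n = len(ranks)
    hits = [r for r in ranks if r is not None and r <= k]
    accuracy = len(hits) / n            # fraction of queries with a hit in the top k
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,      # at most one relevant item among k results
        "recall": accuracy,             # one relevant item, so recall == accuracy
        "mrr": sum(1.0 / r for r in hits) / n,
    }


toy = retrieval_metrics([1, 3, None, 12], k=10)
print(toy)

# Consistency check against the full-precision values in the front matter.
reported_accuracy_at_10 = 0.07720588235294118
reported_precision_at_10 = 0.007720588235294118
print(abs(reported_accuracy_at_10 / 10 - reported_precision_at_10) < 1e-12)
```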
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 8,290 training samples
+ * Columns: <code>positive</code> and <code>anchor</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | positive | anchor |
+   |:--------|:---------|:-------|
+   | type    | string | string |
+   | details | <ul><li>min: 98 tokens</li><li>mean: 298.43 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.99 tokens</li><li>max: 12 tokens</li></ul> |
+ * Samples:
+   | positive | anchor |
+   |:---------|:-------|
+   | <code>The Landlord shall have the right, at any time during the Term, to relocate the Premises to other premises (the "New Premises") in the Development on the same terms and conditions as are set out in this Lease provided that: (a) the Landlord shall first have given not less than 90 days notice to the Tenant; (b) the Landlord shall endeavour to ensure that the New Premises be of comparable size and quality to the Premises; (c) the Landlord shall pay the reasonable costs incurred by the Tenant for: (i) its physical move; (ii) the reconnection of existing communication lines; and (iii) the reordering of new printed material plates and the printing of an equal quantity and quality of printed material the tenant has in stock as the time of the relocation; (d) if the Rentable Area of the New Premises is not the same as the Rentable Area of the Premises, the total Basic Rent payable under this Lease (but not the Basic Rent per square foot of Rentable Area) shall be adjusted accordingly; and (e)...</code> | <code>Right to Relocate</code> |
+   | <code>39. Holdover: If Tenant shall hold over after the expiration of the Lease Term, without written agreement providing otherwise, Tenant shall be deemed to be a tenant at sufferance on month to month basis, at a monthly rental, payable in advance, equal to double the base rent then being paid by Tenant, and Tenant shall be bound by all of the other terms, covenants and agreements of the Lease. Nothing contained herein shall be construed to give Tenant the right to hold over at any time, extend the Term or prevent Landlord from immediate recovery of possession of the Premises by summary proceedings or otherwise and Landlord may exercise any and all remedies at law or in equity to recover possession of the Premises, as well as any damages incurred by Landlord, by Tenant's failure to vacate the Premises and deliver possession to Landlord as herein provided.</code> | <code>Holding Over</code> |
+   | <code>30. HOLDING OVER. If Tenant remains in possession of the Leased Premises after expiration of the Term, or after any termination of the Lease by Landlord without written agreement between the parties, Tenant shall be a tenant at sufferance and such tenancy shall be subject to the provisions hereof, except that Gross Rent for said holdover period shall be one hundred twenty five percent (125%) of the amount of Gross Rent due in the last month of the Term. Nothing in this Section 30 shall be construed as consent by Landlord to the possession of the Leased Premises by Tenant after the expiration of the Term or termination of the Lease by Landlord. In the event Tenant provides written notice to Landlord of its intent to holdover at least sixty (60) days prior to the end of the Term and Landlord does not object to such request within thirty (30) days after receipt thereof, it shall be deemed that Landlord has consented to such holdover and this Lease shall continue on a month-to-month basis ...</code> | <code>Holding Over</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
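
The loss parameters above (`scale` of 20.0 with `cos_sim`) can be read as follows: within a batch, each anchor is scored against every positive, the cosine similarities are multiplied by the scale, and cross-entropy is taken with the matching positive as the target, so the other positives act as in-batch negatives. A pure-Python sketch of that idea on toy 2-dimensional vectors; `mnr_loss` is an illustrative name and this is not the library's implementation:

```python
import math


def cos_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(mnr_loss(anchors, aligned))  # close to 0: each anchor matches its own positive
print(mnr_loss(anchors, swapped))  # close to 20: each anchor prefers the wrong positive
```

Because other rows in the batch serve as negatives, two rows that share the same anchor text would penalise a correct match; the `no_duplicates` batch sampler listed in the hyperparameters is the usual way to avoid that.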
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 30
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `tf32`: False
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
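
A worked example of what the batch settings above imply, assuming a single device (the card does not say how many were used) and that every batch is full. Under those assumptions the arithmetic reproduces the fractional epoch of the first rows of the training log (step 10 at epoch 0.6154); later rows of that log do not follow the same linear relation, and this sketch does not try to explain why.

```python
import math

dataset_size = 8290                 # "Size: 8,290 training samples"
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_devices = 1                     # assumption: not stated in the card

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
batches_per_epoch = math.ceil(
    dataset_size / (per_device_train_batch_size * num_devices)
)
steps_per_epoch = batches_per_epoch / gradient_accumulation_steps


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps, under the assumptions above."""
    return step / steps_per_epoch


print(effective_batch_size)    # 512
print(batches_per_epoch)       # 260
print(steps_per_epoch)         # 16.25
print(round(epoch_at(10), 4))  # 0.6154
```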
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 30
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: False
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `eval_use_gather_object`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
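
The combination `learning_rate: 2e-05`, `lr_scheduler_type: cosine` and `warmup_ratio: 0.1` describes a linear warmup followed by a half-cosine decay. A rough sketch of that curve, assuming 480 total optimizer steps (the last step shown in the training log, not a value stated in the hyperparameters) and a decay that ends at zero:

```python
import math

base_lr = 2e-05
warmup_ratio = 0.1
total_steps = 480  # assumption: last step shown in the training log
warmup_steps = math.ceil(total_steps * warmup_ratio)


def learning_rate(step):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(step, learning_rate(step))
```

The peak rate is reached exactly at the end of warmup, and the rate falls to half of the peak midway between the end of warmup and the final step.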
+
+ ### Training Logs
+ | Epoch      | Step    | Training Loss | dim_768_cosine_ndcg@10 |
+ |:----------:|:-------:|:-------------:|:----------------------:|
+ | 0.6154     | 10      | 2.5422        | -                      |
+ | 1.2308     | 20      | 1.3661        | -                      |
+ | 1.8462     | 30      | 0.1879        | -                      |
+ | 2.4615     | 40      | 0.0           | -                      |
+ | 3.0769     | 50      | 0.0           | -                      |
+ | 3.3846     | 55      | -             | 0.0252                 |
+ | 1.2846     | 60      | 0.8868        | -                      |
+ | 1.9        | 70      | 1.4243        | -                      |
+ | 2.5154     | 80      | 0.1644        | -                      |
+ | 3.1308     | 90      | 0.0041        | -                      |
+ | 3.7462     | 100     | 0.0           | -                      |
+ | 4.3615     | 110     | 0.0           | 0.0301                 |
+ | 2.5692     | 120     | 1.0665        | -                      |
+ | 3.1846     | 130     | 0.4817        | -                      |
+ | 3.8        | 140     | 0.0021        | -                      |
+ | 4.4154     | 150     | 0.0           | -                      |
+ | 5.0308     | 160     | 0.0           | -                      |
+ | 5.4        | 166     | -             | 0.0328                 |
+ | 3.2385     | 170     | 0.4318        | -                      |
+ | 3.8538     | 180     | 0.7595        | -                      |
+ | 4.4692     | 190     | 0.0737        | -                      |
+ | 5.0846     | 200     | 0.0004        | -                      |
+ | 5.7        | 210     | 0.0           | -                      |
+ | 6.3154     | 220     | 0.0           | -                      |
+ | 6.3769     | 221     | -             | 0.0354                 |
+ | 4.5231     | 230     | 0.736         | -                      |
+ | 5.1385     | 240     | 0.3332        | -                      |
+ | 5.7538     | 250     | 0.0008        | -                      |
+ | 6.3692     | 260     | 0.0           | -                      |
+ | 6.9846     | 270     | 0.0           | -                      |
+ | 7.3538     | 276     | -             | 0.0336                 |
+ | 5.1923     | 280     | 0.3014        | -                      |
+ | 5.8077     | 290     | 0.5931        | -                      |
+ | 6.4231     | 300     | 0.0735        | -                      |
+ | 7.0385     | 310     | 0.0002        | -                      |
+ | 7.6538     | 320     | 0.0           | -                      |
+ | 8.2692     | 330     | 0.0           | -                      |
+ | **8.3923** | **332** | **-**         | **0.0374**             |
+ | 6.4769     | 340     | 0.5984        | -                      |
+ | 7.0923     | 350     | 0.2797        | -                      |
+ | 7.7077     | 360     | 0.0005        | -                      |
+ | 8.3231     | 370     | 0.0           | -                      |
+ | 8.9385     | 380     | 0.0           | -                      |
+ | 9.3692     | 387     | -             | 0.0355                 |
+ | 7.1462     | 390     | 0.1997        | -                      |
+ | 7.7615     | 400     | 0.5201        | -                      |
+ | 8.3769     | 410     | 0.0799        | -                      |
+ | 8.9923     | 420     | 0.0001        | -                      |
+ | 9.6077     | 430     | 0.0           | -                      |
+ | 10.2231    | 440     | 0.0           | -                      |
+ | 10.4077    | 443     | -             | 0.0362                 |
+ | 8.4308     | 450     | 0.5072        | -                      |
+ | 9.0462     | 460     | 0.2583        | -                      |
+ | 9.6615     | 470     | 0.0005        | -                      |
+ | 10.2769    | 480     | 0.0           | 0.0362                 |
+
+ * The bold row denotes the saved checkpoint.
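
The bold row can be recovered from the log itself: it is the evaluation row with the highest `dim_768_cosine_ndcg@10`. A short sketch, assuming that this column is the metric used to pick the checkpoint (the card lists `load_best_model_at_end: True` but does not name the selection metric):

```python
# (step, dim_768_cosine_ndcg@10) pairs copied from the evaluation rows above.
eval_rows = [
    (55, 0.0252), (110, 0.0301), (166, 0.0328), (221, 0.0354), (276, 0.0336),
    (332, 0.0374), (387, 0.0355), (443, 0.0362), (480, 0.0362),
]

# Pick the row with the highest score; ties would resolve to the earliest step.
best_step, best_score = max(eval_rows, key=lambda row: row[1])
print(best_step, best_score)  # 332 0.0374
```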
+
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.43.1
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.2.1
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-base-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.43.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
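
The sizes in this config are enough to estimate the parameter count of the encoder. The sketch below assumes the standard BERT layout (word, position and token-type embeddings, 12 identical encoder layers, and a pooler head); the layout and the presence of the pooler are assumptions about the architecture, not something the diff states.

```python
hidden = 768            # hidden_size
layers = 12             # num_hidden_layers
intermediate = 3072     # intermediate_size
vocab = 30522           # vocab_size
positions = 512         # max_position_embeddings
token_types = 2         # type_vocab_size


def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden  # scale and shift vectors
embeddings = (vocab + positions + token_types) * hidden + layer_norm
attention = 4 * linear(hidden, hidden) + layer_norm  # Q, K, V, output projection
feed_forward = (
    linear(hidden, intermediate) + linear(intermediate, hidden) + layer_norm
)
pooler = linear(hidden, hidden)  # assumption: the BERT pooler head is present

total = embeddings + layers * (attention + feed_forward) + pooler
print(total)      # 109482240
print(total * 4)  # 437928960 bytes at float32
```

At 4 bytes per `float32` parameter this comes to roughly 438 MB, which is close to the 437,951,328 bytes reported for `model.safetensors`; the small remainder is plausibly file metadata, though that is not verified here.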
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.43.1",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36b5254b97b1330997e306d98e762e49c94871457d176f9ff53bdda866bed0cb
+ size 437951328
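
The three added lines are a Git LFS pointer: the repository stores this small text file while the weights live in LFS storage. A minimal sketch of reading such a pointer, with the pointer text copied from the diff above; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling:

```python
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:36b5254b97b1330997e306d98e762e49c94871457d176f9ff53bdda866bed0cb
size 437951328
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algorithm"])    # sha256
print(pointer["size_bytes"] / 1e6)  # size in megabytes (10**6 bytes)
# Rough upper bound on float32 parameters: 4 bytes each, ignoring the file header.
print(pointer["size_bytes"] // 4)   # 109487832
```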
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
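
`modules.json` lists the stages of the model in execution order, each with the subfolder that holds its configuration. A small sketch that reads the same content (inlined here as a string) and prints the resulting pipeline; it only parses JSON and does not load any model:

```python
import json

modules_json = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"}
]
"""

modules = sorted(json.loads(modules_json), key=lambda m: m["idx"])
pipeline = [m["type"].rsplit(".", 1)[-1] for m in modules]
print(" -> ".join(pipeline))  # Transformer -> Pooling -> Normalize

for m in modules:
    # An empty path means the module's files sit at the repository root.
    print(m["idx"], m["path"] or "(repository root)")
```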
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
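
The tokenizer config fixes the special-token ids (`[PAD]` is 0, `[CLS]` is 101, `[SEP]` is 102) and a `model_max_length` of 512. A hedged sketch of how a single text is typically framed for a BERT-style encoder with those values; `build_input_ids` is a hypothetical helper and the word-piece ids passed to it are placeholders, not real vocabulary lookups:

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # ids from added_tokens_decoder
MODEL_MAX_LENGTH = 512                # model_max_length


def build_input_ids(token_ids, max_length=MODEL_MAX_LENGTH, pad_to=None):
    """Wrap word-piece ids as [CLS] ... [SEP], truncating and optionally padding."""
    body = list(token_ids)[: max_length - 2]  # leave room for [CLS] and [SEP]
    ids = [CLS_ID] + body + [SEP_ID]
    if pad_to is not None:
        ids += [PAD_ID] * max(0, pad_to - len(ids))
    return ids


# 2000 and 2001 are placeholder ids, not looked up in the real vocabulary.
print(build_input_ids([2000, 2001], pad_to=6))  # [101, 2000, 2001, 102, 0, 0]
print(len(build_input_ids(range(1000, 2000))))  # 512
```

Long lease clauses such as the ones in the training samples can exceed this limit, in which case everything past the first 510 word pieces is dropped under this framing.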
vocab.txt ADDED
The diff for this file is too large to render.