bobox committed
Commit d1dc986 · verified · 1 Parent(s): ca0ee78

All layers trained at every step.

AdaptiveLayerLoss(model=model,
    loss=train_loss,
    n_layers_per_step=-1,
    last_layer_weight=1.5,
    prior_layers_weight=0.1,
    kl_div_weight=0.5,
    kl_temperature=1,
)

num_epochs = 2
learning_rate = 2e-5
warmup_ratio = 0.25

weight_decay = 1e-6

schedule = "cosine_with_restarts"
num_cycles = 3
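
For readers unfamiliar with the schedule named above, the following is a minimal sketch of how a cosine schedule with hard restarts and linear warmup scales the learning rate. It assumes that `"cosine_with_restarts"` refers to the Hugging Face `transformers` scheduler of that name and reimplements its multiplier in plain Python. The function name `cosine_with_restarts_factor` and the step count `total_steps = 1000` are hypothetical, since the commit does not state the number of training steps.

```python
import math


def cosine_with_restarts_factor(step, total_steps, warmup_ratio=0.25, num_cycles=3):
    """Return the multiplier applied to the base learning rate at `step`."""
    # Assumption: warmup steps are derived from the ratio by rounding up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the base learning rate.
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that jumps back to the peak at the start of each cycle.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))


learning_rate = 2e-5
total_steps = 1000  # hypothetical value, not taken from the commit

for step in (0, 125, 250, 375, 500, 750, 999):
    lr = learning_rate * cosine_with_restarts_factor(step, total_steps)
    print(f"step {step:4d}: lr = {lr:.2e}")
```

With `warmup_ratio = 0.25`, the first quarter of training ramps the rate up linearly; the remaining steps are split into `num_cycles = 3` cosine decays, each restarting from the peak rate.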

Files changed (2)
  1. README.md +450 -105
  2. pytorch_model.bin +2 -2
README.md CHANGED
@@ -7,7 +7,7 @@ tags:
7
  - sentence-similarity
8
  - feature-extraction
9
  - generated_from_trainer
10
- - dataset_size:975301
11
  - loss:AdaptiveLayerLoss
12
  - loss:CoSENTLoss
13
  - loss:GISTEmbedLoss
@@ -30,36 +30,182 @@ datasets:
30
  - sentence-transformers/trivia-qa
31
  - sentence-transformers/quora-duplicates
32
  - sentence-transformers/gooaq
33
  widget:
34
- - source_sentence: Centrosome-independent mitotic spindle formation in vertebrates.
 
35
  sentences:
36
- - Birds pair up with the same bird in mating season.
37
- - We use voltage to keep track of electric potential energy.
38
- - A mitotic spindle forms from the centrosomes.
39
- - source_sentence: A dog carrying a stick in its mouth runs through a snow-covered
40
- field.
41
  sentences:
42
- - The children played on the floor.
43
  - A pair of people play video games together on a couch.
44
- - A animal carried a stick through a snow covered field.
45
- - source_sentence: A guy on a skateboard, jumping off some steps.
 
46
  sentences:
47
- - A woman is making music.
48
- - a guy with a skateboard making a jump
49
- - A dog holds an object in the water.
50
- - source_sentence: A photographer with bushy dark hair takes a photo of a skateboarder
51
- at an indoor park.
52
  sentences:
53
- - The person with the camera photographs the person skating.
54
- - A man starring at a piece of paper.
55
- - The man is riding a bike in sand.
56
- - source_sentence: Why did oil start getting priced in terms of gold?
57
  sentences:
58
- - Because oil was priced in dollars, oil producers' real income decreased.
59
- - This allows all set top boxes in a household to share recordings and other media.
60
- - Only the series from 2009 onwards are available on Blu-ray, except for the 1970
61
- story Spearhead from Space, released in July 2013.
62
  pipeline_tag: sentence-similarity
63
  ---
64
 
65
  # SentenceTransformer based on microsoft/deberta-v3-small
@@ -127,9 +273,9 @@ from sentence_transformers import SentenceTransformer
127
  model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")
128
  # Run inference
129
  sentences = [
130
- 'Why did oil start getting priced in terms of gold?',
131
- "Because oil was priced in dollars, oil producers' real income decreased.",
132
- 'This allows all set top boxes in a household to share recordings and other media.',
133
  ]
134
  embeddings = model.encode(sentences)
135
  print(embeddings.shape)
@@ -165,6 +311,78 @@ You can finetune this model on your own dataset.
165
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
166
  -->
167
 
168
  <!--
169
  ## Bias, Risks and Limitations
170
 
@@ -184,7 +402,7 @@ You can finetune this model on your own dataset.
184
  #### nli-pairs
185
 
186
  * Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
187
- * Size: 100,000 training samples
188
  * Columns: <code>sentence1</code> and <code>sentence2</code>
189
  * Approximate statistics based on the first 1000 samples:
190
  | | sentence1 | sentence2 |
@@ -236,19 +454,19 @@ You can finetune this model on your own dataset.
236
  #### vitaminc-pairs
237
 
238
  * Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
239
- * Size: 50,066 training samples
240
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
241
  * Approximate statistics based on the first 1000 samples:
242
- | | label | sentence1 | sentence2 |
243
- |:--------|:-----------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
244
- | type | int | string | string |
245
- | details | <ul><li>1: 100.00%</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 17.29 tokens</li><li>max: 86 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 37.41 tokens</li><li>max: 249 tokens</li></ul> |
246
  * Samples:
247
- | label | sentence1 | sentence2 |
248
- |:---------------|:--------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
249
- | <code>1</code> | <code>Sykes agreed to sponsor one of Ranulph Fiennes 's expeditions .</code> | <code>Sykes had agreed to sponsor one of his expeditions..He is also a member of the libertarian pressure group The Freedom Association .</code> |
250
- | <code>1</code> | <code>ANTIFA has followers in Berkeley .</code> | <code>It is one of the most politically liberal cities in the United States ( e.g . ANTIFA ) .</code> |
251
- | <code>1</code> | <code>In Saints Row IV , Zinyak is able to use time-travel technology to return to Earth .</code> | <code>Zinjai says they can not restore Earth , but can use time-travel technology to return to Earth , clarifying that Zinyak had used this technology to collect his favorite historical figures , keeping them in suspended animation .</code> |
252
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
253
  ```json
254
  {
@@ -264,19 +482,19 @@ You can finetune this model on your own dataset.
264
  #### qnli-contrastive
265
 
266
  * Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
267
- * Size: 100,000 training samples
268
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
269
  * Approximate statistics based on the first 1000 samples:
270
- | | sentence1 | sentence2 | label |
271
- |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------|
272
- | type | string | string | int |
273
- | details | <ul><li>min: 6 tokens</li><li>mean: 13.73 tokens</li><li>max: 29 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 35.68 tokens</li><li>max: 267 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
274
  * Samples:
275
- | sentence1 | sentence2 | label |
276
- |:-----------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
277
- | <code>When did drama first begin in history?</code> | <code>The Classical era also saw the dawn of drama.</code> | <code>0</code> |
278
- | <code>What is caused by using or selling a patented invention without permission?</code> | <code>There is safe harbor in many jurisdictions to use a patented invention for research.</code> | <code>0</code> |
279
- | <code>What year was the Presidential Proclamation lifted?</code> | <code>Before escaping, the UN Command forces razed most of Hungnam city, especially the port facilities; and on 16 December 1950, President Truman declared a national emergency with Presidential Proclamation No. 2914, 3 C.F.R. 99 (1953), which remained in force until 14 September 1978.[b]</code> | <code>0</code> |
280
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
281
  ```json
282
  {
@@ -292,19 +510,19 @@ You can finetune this model on your own dataset.
292
  #### scitail-pairs-qa
293
 
294
  * Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
295
- * Size: 14,987 training samples
296
  * Columns: <code>sentence2</code> and <code>sentence1</code>
297
  * Approximate statistics based on the first 1000 samples:
298
- | | sentence2 | sentence1 |
299
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
300
- | type | string | string |
301
- | details | <ul><li>min: 7 tokens</li><li>mean: 15.84 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.04 tokens</li><li>max: 34 tokens</li></ul> |
302
  * Samples:
303
- | sentence2 | sentence1 |
304
- |:----------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------|
305
- | <code>A clade is one way of classifying organisms.</code> | <code>What is one way of classifying organisms called?</code> |
306
- | <code>Archimedes' law explains why a ship weighing thousands of metric tons floats on water.</code> | <code>Which law explains why a ship weighing thousands of metric tons floats on water?</code> |
307
- | <code>Being able to read is an example of a learned trait.</code> | <code>An example of a learned trait is</code> |
308
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
309
  ```json
310
  {
@@ -320,19 +538,19 @@ You can finetune this model on your own dataset.
320
  #### scitail-pairs-pos
321
 
322
  * Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
323
- * Size: 8,600 training samples
324
  * Columns: <code>sentence1</code> and <code>sentence2</code>
325
  * Approximate statistics based on the first 1000 samples:
326
- | | sentence1 | sentence2 |
327
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
328
- | type | string | string |
329
- | details | <ul><li>min: 8 tokens</li><li>mean: 23.48 tokens</li><li>max: 65 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 15.54 tokens</li><li>max: 38 tokens</li></ul> |
330
  * Samples:
331
- | sentence1 | sentence2 |
332
- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|
333
- | <code>Biological pollutants, including molds, bacteria, pollen, dust mites and animal dander promote poor indoor air quality and may be a major cause of days lost from work and school, according to the American Lung Association.</code> | <code>Molds, pollen, and pet dander are examples of air pollution with biological sources.</code> |
334
- | <code>Using an endoscope, the plastic surgeon can smooth and tighten the skin and muscles through very small incisions that are easily concealed.</code> | <code>Endoscopes are used to explore the body through various orifices or minor incisions.</code> |
335
- | <code>nucleus The central part of an atom that contains protons, neutrons and other particles.</code> | <code>Protons and neutrons are located in the central nucleus.</code> |
336
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
337
  ```json
338
  {
@@ -348,19 +566,19 @@ You can finetune this model on your own dataset.
348
  #### xsum-pairs
349
 
350
  * Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
351
- * Size: 100,000 training samples
352
  * Columns: <code>sentence1</code> and <code>sentence2</code>
353
  * Approximate statistics based on the first 1000 samples:
354
- | | sentence1 | sentence2 |
355
- |:--------|:-------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
356
- | type | string | string |
357
- | details | <ul><li>min: 29 tokens</li><li>mean: 351.84 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 27.18 tokens</li><li>max: 66 tokens</li></ul> |
358
  * Samples:
359
- | sentence1 | sentence2 |
360
- |:----------|:----------|
361
- | <code>The Scottish champions reached the group stage by beating Hapoel Beer Sheva 5-4 on aggregate.<br>They will open their campaign away to Barcelona on Tuesday 13 September.<br>"It is a great draw - three great teams," said Celtic left-back Kieran Tierney. "That is why we are in the Champions League, to play the best."<br>After their opener in the Nou Camp, Celtic will host Manchester City on 28 September, before home and away games against Monchengladbach on 19 October and 1 November. They then entertain Barcelona on 23 November before concluding away to City on 6 December.<br>Tierney, 19, will be experiencing the group stage for the first time, and is relishing the prospect of confronting Barcelona superstar Lionel Messi<br>"It is just brilliant to watch [Barcelona], and hopefully I can play against him," the youngster added.<br>"We are always hopeful. We prepare ourselves right and we do the right work on the training field. Anything can happen."<br>Celtic have progressed to the last 16 of the competition on three occasions, most recently in season 2012-13.<br>That season, under Neil Lennon, Celtic beat Barca 2-1 in Glasgow at the group stage but lost by the same scoreline away from home.<br>The two teams met again in the following season's group stage, the Spanish side winning 1-0 in Glasgow and 6-1 at home.<br>City - fourth in England's Premier League last season - and German side Monchengladbach have never previously faced Celtic in the competition.<br>"It could've been easier," Celtic chief executive Peter Lawwell told BBC Radio Scotland's Sportsound. "There's some real glamour ties in there - some really great nights ahead at Celtic Park.<br>"I don't think they will be relishing coming to us with the supporters, the atmosphere, the occasion we put on there.<br>"It couldn't be any more difficult but [manager] Brendan [Rodgers] has got off to a great start. 
It has been a remarkable first couple of months for him.<br>"We will attempt to get one more player in before next Wednesday's [transfer] deadline."<br>Former Barca boss Pep Guardiola took over at City, who reached the semi-finals last season, in the summer while recent Celtic signing Kolo Toure is a former City player whose brother is still at the Manchester club and also once played for Barcelona.<br>Rodgers, who took over from Ronny Deila in the summer, previously managed Liverpool at this stage of the competition.</code> | <code>Celtic have been drawn to face Barcelona, Manchester City and Borussia Monchengladbach in Champions League Group C.</code> |
362
- | <code>Lesley Titcomb said: "We were not, as I'm aware, advised in advance. We learnt about the sale from the newspapers."<br>She was giving evidence to MPs about the collapse last month of BHS, whose pension fund had a £571m deficit.<br>The pensions watchdog was first in discussions with BHS in 2009 about its pension fund deficit.<br>As soon as the sale of BHS by Sir Philip Green to a consortium called Retail Acquisitions was announced, Ms Titcomb said the Regulator opened an anti-avoidance case to determine whether the previous owners should be pursued to make up the fund's shortfall.<br>Richard Fuller, one of the MPs attending the joint hearing of the Work and Pensions and Business select committees, said the pensions watchdog did "not sound like much of a regulator".<br>Ms Titcomb disagreed and said it would be inappropriate to put a greater burden on most employers who behave properly.<br>When pension funds had deficits she said the company it belonged to had to be given sufficient time to rectify that situation. 
It was difficult when people were "irresponsible", but that was not the case in the vast majority of situations, Ms Titcomb told MPs.<br>However, the head of the Pension Protection Fund (PPF) earlier told MPs that rescue plans to erase deficits for pension schemes should have time limits.<br>Alan Rubenstein said that in 2012 BHS's pension fund trustees had submitted a 23-year plan to return the fund to a surplus.<br>He said the average length of a recovery plan was nine years: "We've learned that 23-year recovery plans are rather ambitious."<br>The PPF wanted recovery periods to be "as short as possible", but he acknowledged that the Pensions Regulator did not want to "push companies over the edge".<br>He told MPs that the cost to the PPF to rescue BHS's pension fund would be about £275m - a sum that he said would not affect the fund's finances.<br>The PPF, which protects pensioners in the event of a company failing with a pension fund that is in deficit, is funded by a levy on all UK pension funds.<br>Mr Rubenstein said that the Pensions Regulator should have more power to intervene in takeovers when it was concerned about a company's pension scheme deficit.<br>He told MPs that a subsidiary of the Arcadia Group called Davenbush withdrew a guarantee for the BHS pension scheme in 2012. 
Arcadia is the retail empire controlled by Sir Philip.<br>Sir Philip, the owner of Top Shop, has agreed next month to answer MPs' questions about the sale of BHS, which he owned for 15 years.<br>Tom McPhail, head of retirement policy at Hargreaves Lansdown, said the UK's pension system was "creaking at the seams".<br>"The revelations that the BHS scheme had a 23-year deficit reduction programme and that the Regulator didn't know about the sale of BHS until announced in the papers raises uncomfortable questions about the adequacy of the protection of pension scheme members generally," he said.<br>"Pension investors, whether in final salary schemes or money purchase arrangements, have a right to expect their retirement savings to be protected by professional managers operating to high regulatory standards."<br>Separately, the Work and Pensions Committee said it had invited the former and current Chair of Trustees and Trustees of the BHS pension fund to give evidence on 25 May.</code> | <code>The head of the Pensions Regulator learned about the sale of BHS for £1 last year through the media, MPs heard.</code> |
363
- | <code>The 54-year-old takes charge of a team in need of a lift, with Pirates lying 10th in the league and having recently suffered a club record 6-0 defeat.<br>Jonevret has been named coach of the year in both Sweden and Norway.<br>"It is my sincere hope and desire that I can repay the faith the Chairman has shown in me," the former assistant coach of Sweden said.<br>"I greatly appreciate the opportunity to work as head coach of Pirates," he told the club website.<br>Jonevret, who won the Swedish league and cup double with Djurgardens in 2005, was then voted coach of the year in Norway in 2009 after his work with Molde.<br>His most recent job was with Viking Stavanger, who he left in November 2016 after four years in the job.<br>The former player takes over with immediate effect from Augusto Palacios, who had been working as interim coach since November following the dismissal of Muhsin Ertugral.<br>Pirates chairman Irvin Khoza said Jonevret will bring 'a wealth of experience and professionalism.'<br>"One other aspect which attracted us to Mr Jonevret was his loyalty to the clubs he worked for," Khoza told Pirates website.<br>"In an industry where coaches around the world have become accustomed to moving from one club to another, Jonevret has shown in (his) previous clubs that he is someone who wants to be involved in long term projects and not quick fixes."<br>"Success is a journey and not a sprint. 
It is our desire that the club goes back to its former glory and in Jonevret, we believe that we have someone who can achieve that."<br>Sweden's assistant coach between 2011 and 12, Jonevret will be joined on the bench by Benson Mhlongo and Herold Legodi as assistant coaches.<br>Pirates' only realistic chance of success this season is winning the South African FA Cup, which begins next month.<br>The Johannesburg club won the African Champions League in 1995 and were runners-up in the Confederation Cup in 2013.<br>Earlier this month, the club's supporters rioted when Pirates lost 6-0 to rivals Mamelodi Sundowns, the reigning African champions.</code> | <code>South Africa's Orlando Pirates have named Sweden's Kjell Jonevret as their new coach.</code> |
364
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
365
  ```json
366
  {
@@ -376,7 +594,7 @@ You can finetune this model on your own dataset.
376
  #### compression-pairs
377
 
378
  * Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
379
- * Size: 100,000 training samples
380
  * Columns: <code>sentence1</code> and <code>sentence2</code>
381
  * Approximate statistics based on the first 1000 samples:
382
  | | sentence1 | sentence2 |
@@ -394,7 +612,7 @@ You can finetune this model on your own dataset.
394
  {
395
  "loss": "MultipleNegativesSymmetricRankingLoss",
396
  "n_layers_per_step": -1,
397
- "last_layer_weight": 2,
398
  "prior_layers_weight": 0.1,
399
  "kl_div_weight": 0.5,
400
  "kl_temperature": 1
@@ -404,7 +622,7 @@ You can finetune this model on your own dataset.
404
  #### sciq_pairs
405
 
406
  * Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
407
- * Size: 11,679 training samples
408
  * Columns: <code>sentence1</code> and <code>sentence2</code>
409
  * Approximate statistics based on the first 1000 samples:
410
  | | sentence1 | sentence2 |
@@ -432,7 +650,7 @@ You can finetune this model on your own dataset.
432
  #### qasc_pairs
433
 
434
  * Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
435
- * Size: 8,134 training samples
436
  * Columns: <code>id</code>, <code>sentence1</code>, and <code>sentence2</code>
437
  * Approximate statistics based on the first 1000 samples:
438
  | | id | sentence1 | sentence2 |
@@ -488,7 +706,7 @@ You can finetune this model on your own dataset.
488
  #### msmarco_pairs
489
 
490
  * Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
491
- * Size: 100,000 training samples
492
  * Columns: <code>sentence1</code> and <code>sentence2</code>
493
  * Approximate statistics based on the first 1000 samples:
494
  | | sentence1 | sentence2 |
@@ -516,7 +734,7 @@ You can finetune this model on your own dataset.
516
  #### nq_pairs
517
 
518
  * Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
519
- * Size: 100,000 training samples
520
  * Columns: <code>sentence1</code> and <code>sentence2</code>
521
  * Approximate statistics based on the first 1000 samples:
522
  | | sentence1 | sentence2 |
@@ -544,7 +762,7 @@ You can finetune this model on your own dataset.
544
  #### trivia_pairs
545
 
546
  * Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
547
- * Size: 73,346 training samples
548
  * Columns: <code>sentence1</code> and <code>sentence2</code>
549
  * Approximate statistics based on the first 1000 samples:
550
  | | sentence1 | sentence2 |
@@ -572,7 +790,7 @@ You can finetune this model on your own dataset.
572
  #### quora_pairs
573
 
574
  * Dataset: [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
575
- * Size: 100,000 training samples
576
  * Columns: <code>sentence1</code> and <code>sentence2</code>
577
  * Approximate statistics based on the first 1000 samples:
578
  | | sentence1 | sentence2 |
@@ -600,7 +818,7 @@ You can finetune this model on your own dataset.
600
  #### gooaq_pairs
601
 
602
  * Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
603
- * Size: 100,000 training samples
604
  * Columns: <code>sentence1</code> and <code>sentence2</code>
605
  * Approximate statistics based on the first 1000 samples:
606
  | | sentence1 | sentence2 |
@@ -630,13 +848,13 @@ You can finetune this model on your own dataset.
630
  #### nli-pairs
631
 
632
  * Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
633
- * Size: 6,808 evaluation samples
634
  * Columns: <code>anchor</code> and <code>positive</code>
635
  * Approximate statistics based on the first 1000 samples:
636
  | | anchor | positive |
637
  |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
638
  | type | string | string |
639
- | details | <ul><li>min: 5 tokens</li><li>mean: 17.64 tokens</li><li>max: 63 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.67 tokens</li><li>max: 29 tokens</li></ul> |
640
  * Samples:
641
  | anchor | positive |
642
  |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------|
@@ -658,13 +876,13 @@ You can finetune this model on your own dataset.
658
  #### scitail-pairs-pos
659
 
660
  * Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
661
- * Size: 1,304 evaluation samples
662
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
663
  * Approximate statistics based on the first 1000 samples:
664
- | | sentence1 | sentence2 | label |
665
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
666
- | type | string | string | int |
667
- | details | <ul><li>min: 5 tokens</li><li>mean: 22.52 tokens</li><li>max: 67 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 15.34 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>0: ~47.50%</li><li>1: ~52.50%</li></ul> |
668
  * Samples:
669
  | sentence1 | sentence2 | label |
670
  |:----------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------|
@@ -686,13 +904,13 @@ You can finetune this model on your own dataset.
686
  #### qnli-contrastive
687
 
688
  * Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
689
- * Size: 5,463 evaluation samples
690
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
691
  * Approximate statistics based on the first 1000 samples:
692
  | | sentence1 | sentence2 | label |
693
  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
694
  | type | string | string | int |
695
- | details | <ul><li>min: 6 tokens</li><li>mean: 14.13 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 36.58 tokens</li><li>max: 225 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
696
  * Samples:
697
  | sentence1 | sentence2 | label |
698
  |:--------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
@@ -716,17 +934,17 @@ You can finetune this model on your own dataset.
716
 
717
  - `eval_strategy`: steps
718
  - `per_device_train_batch_size`: 28
719
- - `per_device_eval_batch_size`: 16
720
- - `learning_rate`: 3e-05
721
  - `weight_decay`: 1e-06
722
- - `num_train_epochs`: 1
723
  - `lr_scheduler_type`: cosine_with_restarts
724
  - `lr_scheduler_kwargs`: {'num_cycles': 3}
725
- - `warmup_ratio`: 0.2
726
  - `save_safetensors`: False
727
  - `fp16`: True
728
  - `push_to_hub`: True
729
- - `hub_model_id`: bobox/DeBERTaV3-small-SenTra-AdaptiveLayers-AllSoft-HighTemp-n
730
  - `hub_strategy`: checkpoint
731
  - `batch_sampler`: no_duplicates
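
The scheduler settings in this list (`lr_scheduler_type: cosine_with_restarts`, `num_cycles: 3`) interact with the warmup ratio and peak learning rate. The sketch below shows the shape of such a schedule in plain Python, assuming the usual "linear warmup, then cosine with hard restarts" multiplier. The values `2e-5` and `0.25` are taken from the commit description, `TOTAL_STEPS` is a hypothetical step count chosen for illustration, and the helper `lr_multiplier` is not part of any library.

```python
import math


def lr_multiplier(step, total_steps, warmup_ratio=0.25, num_cycles=3):
    """Multiplier applied to the peak learning rate at a given step.

    Assumed shape: linear warmup followed by cosine decay with hard restarts.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Each cycle decays from 1 towards 0, then jumps back to 1 (hard restart).
    cycle_position = (num_cycles * progress) % 1.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


PEAK_LR = 2e-5      # from the commit description
TOTAL_STEPS = 1000  # hypothetical, for illustration only

for step in (0, 125, 250, 375, 499, 625, 999):
    print(step, PEAK_LR * lr_multiplier(step, TOTAL_STEPS))
```

With three cycles, the learning rate returns to its peak twice after warmup, which is worth keeping in mind when comparing the evaluation numbers logged at different steps.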
732
 
@@ -738,22 +956,22 @@ You can finetune this model on your own dataset.
738
  - `eval_strategy`: steps
739
  - `prediction_loss_only`: True
740
  - `per_device_train_batch_size`: 28
741
- - `per_device_eval_batch_size`: 16
742
  - `per_gpu_train_batch_size`: None
743
  - `per_gpu_eval_batch_size`: None
744
  - `gradient_accumulation_steps`: 1
745
  - `eval_accumulation_steps`: None
746
- - `learning_rate`: 3e-05
747
  - `weight_decay`: 1e-06
748
  - `adam_beta1`: 0.9
749
  - `adam_beta2`: 0.999
750
  - `adam_epsilon`: 1e-08
751
  - `max_grad_norm`: 1.0
752
- - `num_train_epochs`: 1
753
  - `max_steps`: -1
754
  - `lr_scheduler_type`: cosine_with_restarts
755
  - `lr_scheduler_kwargs`: {'num_cycles': 3}
756
- - `warmup_ratio`: 0.2
757
  - `warmup_steps`: 0
758
  - `log_level`: passive
759
  - `log_level_replica`: warning
@@ -812,7 +1030,7 @@ You can finetune this model on your own dataset.
812
  - `use_legacy_prediction_loop`: False
813
  - `push_to_hub`: True
814
  - `resume_from_checkpoint`: None
815
- - `hub_model_id`: bobox/DeBERTaV3-small-SenTra-AdaptiveLayers-AllSoft-HighTemp-n
816
  - `hub_strategy`: checkpoint
817
  - `hub_private_repo`: False
818
  - `hub_always_push`: False
@@ -844,6 +1062,133 @@ You can finetune this model on your own dataset.
844
 
845
  </details>
846
 
847
  ### Framework Versions
848
  - Python: 3.10.13
849
  - Sentence Transformers: 3.0.1
 
7
  - sentence-similarity
8
  - feature-extraction
9
  - generated_from_trainer
10
+ - dataset_size:78183
11
  - loss:AdaptiveLayerLoss
12
  - loss:CoSENTLoss
13
  - loss:GISTEmbedLoss
 
30
  - sentence-transformers/trivia-qa
31
  - sentence-transformers/quora-duplicates
32
  - sentence-transformers/gooaq
33
+ metrics:
34
+ - pearson_cosine
35
+ - spearman_cosine
36
+ - pearson_manhattan
37
+ - spearman_manhattan
38
+ - pearson_euclidean
39
+ - spearman_euclidean
40
+ - pearson_dot
41
+ - spearman_dot
42
+ - pearson_max
43
+ - spearman_max
44
  widget:
45
+ - source_sentence: The X and Y chromosomes in human beings that determine the sex
46
+ of an individual.
47
  sentences:
48
+ - A glacier leaves behind bare rock when it retreats.
49
+ - Prokaryotes are unicellular organisms that lack organelles surrounded by membranes.
50
+ - Mammalian sex determination is determined genetically by the presence of chromosomes
51
+ identified by the letters x and y.
52
+ - source_sentence: Police officer with riot shield stands in front of crowd.
53
  sentences:
54
+ - A police officer stands in front of a crowd.
55
  - A pair of people play video games together on a couch.
56
+ - People are outside digging a hole.
57
+ - source_sentence: A young girl sitting on a white comforter on a bed covered with
58
+ clothing, holding a yellow stuffed duck.
59
  sentences:
60
+ - A man standing in a room is pointing up.
61
+ - A Little girl is enjoying cake outside.
62
+ - A yellow duck being held by a girl.
63
+ - source_sentence: A teenage girl in winter clothes slides down a decline in a red
64
+ sled.
65
  sentences:
66
+ - A woman preparing vegetables.
67
+ - A girl is sliding on a red sled.
68
+ - A person is on a beach.
69
+ - source_sentence: How many hymns of Luther were included in the Achtliederbuch?
70
  sentences:
71
+ - the ABC News building was renamed Peter Jennings Way in 2006 in honor of the recently
72
+ deceased longtime ABC News chief anchor and anchor of World News Tonight.
73
+ - In early 2009, Disney–ABC Television Group merged ABC Entertainment and ABC Studios
74
+ into a new division, ABC Entertainment Group, which would be responsible for both
75
+ its production and broadcasting operations.
76
+ - Luther's hymns were included in early Lutheran hymnals and spread the ideas of
77
+ the Reformation.
78
  pipeline_tag: sentence-similarity
79
+ model-index:
80
+ - name: SentenceTransformer based on microsoft/deberta-v3-small
81
+ results:
82
+ - task:
83
+ type: semantic-similarity
84
+ name: Semantic Similarity
85
+ dataset:
86
+ name: sts test
87
+ type: sts-test
88
+ metrics:
89
+ - type: pearson_cosine
90
+ value: 0.4121931859939639
91
+ name: Pearson Cosine
92
+ - type: spearman_cosine
93
+ value: 0.4188435395565816
94
+ name: Spearman Cosine
95
+ - type: pearson_manhattan
96
+ value: 0.43722674169112186
97
+ name: Pearson Manhattan
98
+ - type: spearman_manhattan
99
+ value: 0.4419489193187135
100
+ name: Spearman Manhattan
101
+ - type: pearson_euclidean
102
+ value: 0.4165228130620452
103
+ name: Pearson Euclidean
104
+ - type: spearman_euclidean
105
+ value: 0.42369527784158983
106
+ name: Spearman Euclidean
107
+ - type: pearson_dot
108
+ value: 0.13511926964573803
109
+ name: Pearson Dot
110
+ - type: spearman_dot
111
+ value: 0.13030376975519165
112
+ name: Spearman Dot
113
+ - type: pearson_max
114
+ value: 0.43722674169112186
115
+ name: Pearson Max
116
+ - type: spearman_max
117
+ value: 0.4419489193187135
118
+ name: Spearman Max
119
+ - type: pearson_cosine
120
+ value: 0.7746195773286169
121
+ name: Pearson Cosine
122
+ - type: spearman_cosine
123
+ value: 0.7690423402274569
124
+ name: Spearman Cosine
125
+ - type: pearson_manhattan
126
+ value: 0.7641811345210845
127
+ name: Pearson Manhattan
128
+ - type: spearman_manhattan
129
+ value: 0.754454714808573
130
+ name: Spearman Manhattan
131
+ - type: pearson_euclidean
132
+ value: 0.7621768998872902
133
+ name: Pearson Euclidean
134
+ - type: spearman_euclidean
135
+ value: 0.7522944339564277
136
+ name: Spearman Euclidean
137
+ - type: pearson_dot
138
+ value: 0.643272843908074
139
+ name: Pearson Dot
140
+ - type: spearman_dot
141
+ value: 0.6187202562345202
142
+ name: Spearman Dot
143
+ - type: pearson_max
144
+ value: 0.7746195773286169
145
+ name: Pearson Max
146
+ - type: spearman_max
147
+ value: 0.7690423402274569
148
+ name: Spearman Max
149
+ - type: pearson_cosine
150
+ value: 0.7408543477349779
151
+ name: Pearson Cosine
152
+ - type: spearman_cosine
153
+ value: 0.7193195268794856
154
+ name: Spearman Cosine
155
+ - type: pearson_manhattan
156
+ value: 0.7347205138738226
157
+ name: Pearson Manhattan
158
+ - type: spearman_manhattan
159
+ value: 0.716277121285963
160
+ name: Spearman Manhattan
161
+ - type: pearson_euclidean
162
+ value: 0.7317357204840789
163
+ name: Pearson Euclidean
164
+ - type: spearman_euclidean
165
+ value: 0.7133569462956698
166
+ name: Spearman Euclidean
167
+ - type: pearson_dot
168
+ value: 0.5412116736741877
169
+ name: Pearson Dot
170
+ - type: spearman_dot
171
+ value: 0.5324862690078268
172
+ name: Spearman Dot
173
+ - type: pearson_max
174
+ value: 0.7408543477349779
175
+ name: Pearson Max
176
+ - type: spearman_max
177
+ value: 0.7193195268794856
178
+ name: Spearman Max
179
+ - type: pearson_cosine
180
+ value: 0.7408543477349779
181
+ name: Pearson Cosine
182
+ - type: spearman_cosine
183
+ value: 0.7193195268794856
184
+ name: Spearman Cosine
185
+ - type: pearson_manhattan
186
+ value: 0.7347205138738226
187
+ name: Pearson Manhattan
188
+ - type: spearman_manhattan
189
+ value: 0.716277121285963
190
+ name: Spearman Manhattan
191
+ - type: pearson_euclidean
192
+ value: 0.7317357204840789
193
+ name: Pearson Euclidean
194
+ - type: spearman_euclidean
195
+ value: 0.7133569462956698
196
+ name: Spearman Euclidean
197
+ - type: pearson_dot
198
+ value: 0.5412116736741877
199
+ name: Pearson Dot
200
+ - type: spearman_dot
201
+ value: 0.5324862690078268
202
+ name: Spearman Dot
203
+ - type: pearson_max
204
+ value: 0.7408543477349779
205
+ name: Pearson Max
206
+ - type: spearman_max
207
+ value: 0.7193195268794856
208
+ name: Spearman Max
209
  ---
210
 
211
  # SentenceTransformer based on microsoft/deberta-v3-small
 
273
  model = SentenceTransformer("bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-AllSoft")
274
  # Run inference
275
  sentences = [
276
+ 'How many hymns of Luther were included in the Achtliederbuch?',
277
+ "Luther's hymns were included in early Lutheran hymnals and spread the ideas of the Reformation.",
278
+ 'the ABC News building was renamed Peter Jennings Way in 2006 in honor of the recently deceased longtime ABC News chief anchor and anchor of World News Tonight.',
279
  ]
280
  embeddings = model.encode(sentences)
281
  print(embeddings.shape)
 
311
  *List how the model may foreseeably be misused and address what users ought not to do with the model.*
312
  -->
313
 
314
+ ## Evaluation
315
+
316
+ ### Metrics
317
+
318
+ #### Semantic Similarity
319
+ * Dataset: `sts-test`
320
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
321
+
322
+ | Metric | Value |
323
+ |:--------------------|:-----------|
324
+ | pearson_cosine | 0.4122 |
325
+ | **spearman_cosine** | **0.4188** |
326
+ | pearson_manhattan | 0.4372 |
327
+ | spearman_manhattan | 0.4419 |
328
+ | pearson_euclidean | 0.4165 |
329
+ | spearman_euclidean | 0.4237 |
330
+ | pearson_dot | 0.1351 |
331
+ | spearman_dot | 0.1303 |
332
+ | pearson_max | 0.4372 |
333
+ | spearman_max | 0.4419 |
334
+
335
+ #### Semantic Similarity
336
+ * Dataset: `sts-test`
337
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
338
+
339
+ | Metric | Value |
340
+ |:--------------------|:----------|
341
+ | pearson_cosine | 0.7746 |
342
+ | **spearman_cosine** | **0.769** |
343
+ | pearson_manhattan | 0.7642 |
344
+ | spearman_manhattan | 0.7545 |
345
+ | pearson_euclidean | 0.7622 |
346
+ | spearman_euclidean | 0.7523 |
347
+ | pearson_dot | 0.6433 |
348
+ | spearman_dot | 0.6187 |
349
+ | pearson_max | 0.7746 |
350
+ | spearman_max | 0.769 |
351
+
352
+ #### Semantic Similarity
353
+ * Dataset: `sts-test`
354
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
355
+
356
+ | Metric | Value |
357
+ |:--------------------|:-----------|
358
+ | pearson_cosine | 0.7409 |
359
+ | **spearman_cosine** | **0.7193** |
360
+ | pearson_manhattan | 0.7347 |
361
+ | spearman_manhattan | 0.7163 |
362
+ | pearson_euclidean | 0.7317 |
363
+ | spearman_euclidean | 0.7134 |
364
+ | pearson_dot | 0.5412 |
365
+ | spearman_dot | 0.5325 |
366
+ | pearson_max | 0.7409 |
367
+ | spearman_max | 0.7193 |
368
+
369
+ #### Semantic Similarity
370
+ * Dataset: `sts-test`
371
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
372
+
373
+ | Metric | Value |
374
+ |:--------------------|:-----------|
375
+ | pearson_cosine | 0.7409 |
376
+ | **spearman_cosine** | **0.7193** |
377
+ | pearson_manhattan | 0.7347 |
378
+ | spearman_manhattan | 0.7163 |
379
+ | pearson_euclidean | 0.7317 |
380
+ | spearman_euclidean | 0.7134 |
381
+ | pearson_dot | 0.5412 |
382
+ | spearman_dot | 0.5325 |
383
+ | pearson_max | 0.7409 |
384
+ | spearman_max | 0.7193 |
385
+
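
For readers unfamiliar with the metric names in these tables, the following is a minimal sketch of how cosine-based Pearson and Spearman scores can be computed for a semantic similarity test set. It is not the `EmbeddingSimilarityEvaluator` implementation: the toy embeddings and gold scores are hypothetical, and the helper functions are written in plain Python for illustration. The last lines check, using the values reported in the first table, that `pearson_max` matches the largest of the four per-distance Pearson scores.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical sentence-pair embeddings and human similarity scores.
pairs = [
    ([1.0, 0.0], [0.9, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.2, 0.8], [0.3, 0.7]),
]
gold_scores = [4.8, 3.0, 0.5, 4.5]

predicted = [cosine(u, v) for u, v in pairs]
print("pearson_cosine:", pearson(predicted, gold_scores))
print("spearman_cosine:", spearman(predicted, gold_scores))

# Values reported in the first table of this section.
reported_pearson = {"cosine": 0.4122, "manhattan": 0.4372, "euclidean": 0.4165, "dot": 0.1351}
print("pearson_max:", max(reported_pearson.values()))
```

Spearman depends only on the ordering of the predicted similarities, so it is less sensitive than Pearson to how the scores are scaled.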
386
  <!--
387
  ## Bias, Risks and Limitations
388
 
 
402
  #### nli-pairs
403
 
404
  * Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
405
+ * Size: 6,500 training samples
406
  * Columns: <code>sentence1</code> and <code>sentence2</code>
407
  * Approximate statistics based on the first 1000 samples:
408
  | | sentence1 | sentence2 |
 
454
  #### vitaminc-pairs
455
 
456
  * Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
457
+ * Size: 3,194 training samples
458
  * Columns: <code>label</code>, <code>sentence1</code>, and <code>sentence2</code>
459
  * Approximate statistics based on the first 1000 samples:
460
+ | | label | sentence1 | sentence2 |
461
+ |:--------|:-----------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
462
+ | type | int | string | string |
463
+ | details | <ul><li>1: 100.00%</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.76 tokens</li><li>max: 75 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 37.3 tokens</li><li>max: 502 tokens</li></ul> |
464
  * Samples:
465
+ | label | sentence1 | sentence2 |
466
+ |:---------------|:------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
467
+ | <code>1</code> | <code>The film will be screened in 2200 theaters .</code> | <code>In the United States and Canada , pre-release tracking suggest the film will gross $ 7–8 million from 2,200 theaters in its opening weekend , trailing fellow newcomer 10 Cloverfield Lane ( $ 25–30 million projection ) , but similar t</code> |
468
+ | <code>1</code> | <code>Neighbors 2 : Sorority Rising ( film ) scored over 65 % on Rotten Tomatoes .</code> | <code>On Rotten Tomatoes , the film has a rating of 67 % , based on 105 reviews , with an average rating of 5.9/10 .</code> |
469
+ | <code>1</code> | <code>Averaged on more than 65 reviews , The Handmaiden scored 94 % .</code> | <code>On Rotten Tomatoes , the film has a rating of 94 % , based on 67 reviews , with an average rating of 8/10 .</code> |
470
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
471
  ```json
472
  {
 
482
  #### qnli-contrastive
483
 
484
  * Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
485
+ * Size: 4,000 training samples
486
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
487
  * Approximate statistics based on the first 1000 samples:
488
+ | | sentence1 | sentence2 | label |
489
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
490
+ | type | string | string | int |
491
+ | details | <ul><li>min: 6 tokens</li><li>mean: 13.64 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 34.57 tokens</li><li>max: 149 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
492
  * Samples:
493
+ | sentence1 | sentence2 | label |
494
+ |:-----------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
495
+ | <code>What professors established the importance of Whitehead's work?</code> | <code>Professors such as Wieman, Charles Hartshorne, Bernard Loomer, Bernard Meland, and Daniel Day Williams made Whitehead's philosophy arguably the most important intellectual thread running through the Divinity School.</code> | <code>0</code> |
496
+ | <code>When did people start living on the edge of the desert?</code> | <code>It was long believed that the region had been this way since about 1600 BCE, after shifts in the Earth's axis increased temperatures and decreased precipitation.</code> | <code>0</code> |
497
+ | <code>What was the title of Gertrude Stein's 1906-1908 book?</code> | <code>Picasso in turn was an important influence on Stein's writing.</code> | <code>0</code> |
498
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
499
  ```json
500
  {
 
510
  #### scitail-pairs-qa
511
 
512
  * Dataset: [scitail-pairs-qa](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
513
+ * Size: 4,300 training samples
514
  * Columns: <code>sentence2</code> and <code>sentence1</code>
515
  * Approximate statistics based on the first 1000 samples:
516
+ | | sentence2 | sentence1 |
517
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
518
+ | type | string | string |
519
+ | details | <ul><li>min: 7 tokens</li><li>mean: 16.2 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 14.65 tokens</li><li>max: 33 tokens</li></ul> |
520
  * Samples:
521
+ | sentence2 | sentence1 |
522
+ |:-------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------|
523
+ | <code>Ash that enters the air naturally as a result of a volcano eruption is classified as a primary pollutant.</code> | <code>Ash that enters the air naturally as a result of a volcano eruption is classified as what kind of pollutant?</code> |
524
+ | <code>Exposure to ultraviolet radiation can increase the amount of pigment in the skin and make it appear darker.</code> | <code>Exposure to what can increase the amount of pigment in the skin and make it appear darker?</code> |
525
+ | <code>A lysozyme destroys bacteria by digesting their cell walls.</code> | <code>How does lysozyme destroy bacteria?</code> |
526
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
527
  ```json
528
  {
 
538
  #### scitail-pairs-pos
539
 
540
  * Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
541
+ * Size: 2,200 training samples
542
  * Columns: <code>sentence1</code> and <code>sentence2</code>
543
  * Approximate statistics based on the first 1000 samples:
544
+ | | sentence1 | sentence2 |
545
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
546
+ | type | string | string |
547
+ | details | <ul><li>min: 7 tokens</li><li>mean: 23.6 tokens</li><li>max: 74 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 15.23 tokens</li><li>max: 41 tokens</li></ul> |
548
  * Samples:
549
+ | sentence1 | sentence2 |
550
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------|
551
+ | <code>An atom that gains electrons would be a negative ion.</code> | <code>Atoms that have gained electrons and become negatively charged are called negative ions.</code> |
552
+ | <code>Scientists will use data collected during the collisions to explore the particles known as quarks and gluons that make up protons and neutrons.</code> | <code>Protons and neutrons are made of quarks, which are fundamental particles of matter.</code> |
553
+ | <code>Watersheds and divides All of the land area whose water drains into a stream system is called the system's watershed.</code> | <code>All of the land drained by a river system is called its basin, or the "wet" term watershed</code> |
554
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
555
  ```json
556
  {
 
566
  #### xsum-pairs
567
 
568
  * Dataset: [xsum-pairs](https://huggingface.co/datasets/sentence-transformers/xsum) at [788ddaf](https://huggingface.co/datasets/sentence-transformers/xsum/tree/788ddafe04e539956d56b567bc32a036ee7b9206)
569
+ * Size: 2,500 training samples
570
  * Columns: <code>sentence1</code> and <code>sentence2</code>
571
  * Approximate statistics based on the first 1000 samples:
572
+ | | sentence1 | sentence2 |
573
+ |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
574
+ | type | string | string |
575
+ | details | <ul><li>min: 2 tokens</li><li>mean: 350.46 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 27.13 tokens</li><li>max: 70 tokens</li></ul> |
576
  * Samples:
577
+ | sentence1 | sentence2 |
578
+ |:----------|:----------|
579
+ | <code>An eyewitness told BBC Persian that the crowds were sharply divided between hardliners and moderates, but it was clear many people had responded to a call from former President Mohammad Khatami to attend the funeral as a show of support for the opposition reform movement.<br>Some were chanting opposition slogans, and others carried placards emphasising Mr Rafsanjani's links to the moderate and reformist camps.<br>"Long live Khatami, Long Live Rouhani. Hashemi, your soul is at peace!" said one banner.<br>"The circle became too closed for the centre," said another, using a quotation from Persian poetry to underline the growing distance in recent years between Mr Rafsanjani and Iran's hardline political establishment.<br>At one stage state television played loud music over its live broadcast of the event in order to drown out opposition slogans being chanted by the crowd.<br>As the official funeral eulogies were relayed to the crowds on the streets, they responded with calls of support for former President Khatami, and opposition leader Mir Hossein Mousavi, and shouts of: "You have the loudspeakers, we have the voice! Shame on you, Shame on State TV!"<br>On Iranian social media the funeral has been the number one topic with many opposition supporters using the hashtag #weallgathered to indicate their support and sympathy.<br>People have been posting photos and videos emphasising the number of opposition supporters out on the streets and showing the opposition slogans which state TV has been trying to obscure.<br>But government supporters have also taken to Twitter to play down the opposition showing at the funeral, accusing them of political opportunism.<br>"A huge army came out of love of the Supreme Leader," wrote a cleric called Sheikh Reza. "While a few foot soldiers came with their cameras to show off."<br>Another conversation engaging many on Twitter involved the wording of the prayers used at the funeral.<br>Did the Supreme Leader Ayatollah Ali Khamenei deliberately leave out a section praising the goodness of the deceased, some opposition supporters asked. And was this a comment on the political tensions between the two?<br>"No," responded another Twitter user, cleric Abbas Zolghadri. "The words of the prayer can be changed. There are no strict rules."<br>He followed this with a poignant photo of an empty grave - "Hashemi's final resting place" was the caption, summing up the sense of loss felt by Iranians of many different political persuasions despite the deep and bitter divisions.</code> | <code>Tehran has seen some of the biggest crowds on the streets since the 2009 "Green Movement" opposition demonstrations, as an estimated 2.5 million people gathered to bid farewell to Akbar Hashemi Rafsanjani, the man universally known as "Hashemi".</code> |
580
+ | <code>Mark Evans is retracing the same route across the Rub Al Khali, also known as the "Empty Quarter", taken by Bristol pioneer Bertram Thomas in 1930.<br>The 54-year-old Shropshire-born explorer is leading a three-man team to walk the 800 mile (1,300 km) journey from Salalah, Oman to Doha, Qatar.<br>The trek is expected to take 60 days.<br>The Rub Al Khali desert is considered one of the hottest, driest and most inhospitable places on earth.<br>Nearly two decades after Thomas completed his trek, British explorer and writer Sir Wilfred Thesiger crossed the Empty Quarter - mapping it in detail along the way.<br>60 days<br>To cross the Rub' Al Khali desert<br>* From Salalah in Oman to Doha, Qatar<br>* Walking with camels for 1,300km<br>* Area nearly three times the size of the UK<br>Completed by explorer Bertram Thomas in 1930<br>Bertram Thomas, who hailed from Pill, near Bristol, received telegrams of congratulation from both King George V and Sultan Taimur, then ruler of Oman.<br>He went on to lecture all over the world about the journey and to write a book called Arabia Felix.<br>Unlike Mr Evans, Thomas did not obtain permission for his expedition.<br>He said: "The biggest challenges for Thomas were warring tribes, lack of water in the waterholes and his total dependence on his Omani companion Sheikh Saleh to negotiate their way through the desert.<br>"The biggest challenge for those who wanted to make the crossing in recent decades has been obtaining government permissions to walk through this desolate and unknown territory."</code> | <code>An explorer has embarked on a challenge to become only the third British person in history to cross the largest sand desert in the world.</code> |
581
+ | <code>An Olympic gold medallist, he was also three-time world heavyweight champion and took part in some of the most memorable fights in boxing history.<br>He had a professional career spanning 21 years and BBC Sport takes a look at his 61 fights in more detail.</code> | <code>Boxing legend Muhammad Ali, who died at the age of 74, became a sporting icon during his career.</code> |
  * Loss: [<code>AdaptiveLayerLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#adaptivelayerloss) with these parameters:
  ```json
  {
 
  #### compression-pairs

  * Dataset: [compression-pairs](https://huggingface.co/datasets/sentence-transformers/sentence-compression) at [605bc91](https://huggingface.co/datasets/sentence-transformers/sentence-compression/tree/605bc91d95631895ba25b6eda51a3cb596976c90)
+ * Size: 4,000 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  {
      "loss": "MultipleNegativesSymmetricRankingLoss",
      "n_layers_per_step": -1,
+     "last_layer_weight": 1.5,
      "prior_layers_weight": 0.1,
      "kl_div_weight": 0.5,
      "kl_temperature": 1
 
  #### sciq_pairs

  * Dataset: [sciq_pairs](https://huggingface.co/datasets/allenai/sciq) at [2c94ad3](https://huggingface.co/datasets/allenai/sciq/tree/2c94ad3e1aafab77146f384e23536f97a4849815)
+ * Size: 6,500 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  #### qasc_pairs

  * Dataset: [qasc_pairs](https://huggingface.co/datasets/allenai/qasc) at [a34ba20](https://huggingface.co/datasets/allenai/qasc/tree/a34ba204eb9a33b919c10cc08f4f1c8dae5ec070)
+ * Size: 6,500 training samples
  * Columns: <code>id</code>, <code>sentence1</code>, and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | id | sentence1 | sentence2 |
 
  #### msmarco_pairs

  * Dataset: [msmarco_pairs](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3) at [28ff31e](https://huggingface.co/datasets/sentence-transformers/msmarco-msmarco-distilbert-base-v3/tree/28ff31e4c97cddd53d298497f766e653f1e666f9)
+ * Size: 6,500 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  #### nq_pairs

  * Dataset: [nq_pairs](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
+ * Size: 6,500 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  #### trivia_pairs

  * Dataset: [trivia_pairs](https://huggingface.co/datasets/sentence-transformers/trivia-qa) at [a7c36e3](https://huggingface.co/datasets/sentence-transformers/trivia-qa/tree/a7c36e3c8c8c01526bc094d79bf80d4c848b0ad0)
+ * Size: 6,500 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  #### quora_pairs

  * Dataset: [quora_pairs](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
+ * Size: 4,000 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  #### gooaq_pairs

  * Dataset: [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+ * Size: 6,500 training samples
  * Columns: <code>sentence1</code> and <code>sentence2</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 |
 
  #### nli-pairs

  * Dataset: [nli-pairs](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
+ * Size: 750 evaluation samples
  * Columns: <code>anchor</code> and <code>positive</code>
  * Approximate statistics based on the first 1000 samples:
  | | anchor | positive |
  |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
  | type | string | string |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 17.61 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 9.71 tokens</li><li>max: 29 tokens</li></ul> |
  * Samples:
  | anchor | positive |
  |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------|
 
  #### scitail-pairs-pos

  * Dataset: [scitail-pairs-pos](https://huggingface.co/datasets/allenai/scitail) at [0cc4353](https://huggingface.co/datasets/allenai/scitail/tree/0cc4353235b289165dfde1c7c5d1be983f99ce44)
+ * Size: 750 evaluation samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 22.43 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 15.3 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>0: ~50.00%</li><li>1: ~50.00%</li></ul> |
  * Samples:
  | sentence1 | sentence2 | label |
  |:----------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------|
 
  #### qnli-contrastive

  * Dataset: [qnli-contrastive](https://huggingface.co/datasets/nyu-mll/glue) at [bcdcba7](https://huggingface.co/datasets/nyu-mll/glue/tree/bcdcba79d07bc864c1c254ccfcedcce55bcc9a8c)
+ * Size: 750 evaluation samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence1 | sentence2 | label |
  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------|
  | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 14.15 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 36.98 tokens</li><li>max: 225 tokens</li></ul> | <ul><li>0: 100.00%</li></ul> |
  * Samples:
  | sentence1 | sentence2 | label |
  |:--------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
 

  - `eval_strategy`: steps
  - `per_device_train_batch_size`: 28
+ - `per_device_eval_batch_size`: 18
+ - `learning_rate`: 2e-05
  - `weight_decay`: 1e-06
+ - `num_train_epochs`: 2
  - `lr_scheduler_type`: cosine_with_restarts
  - `lr_scheduler_kwargs`: {'num_cycles': 3}
+ - `warmup_ratio`: 0.25
  - `save_safetensors`: False
  - `fp16`: True
  - `push_to_hub`: True
+ - `hub_model_id`: bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-2-checkpoints-tmp
  - `hub_strategy`: checkpoint
  - `batch_sampler`: no_duplicates
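
The scheduler settings in this list (`learning_rate`: 2e-05, `lr_scheduler_type`: cosine_with_restarts, `num_cycles`: 3, `warmup_ratio`: 0.25) can be sketched in plain Python to see how the learning rate would evolve. This is only an illustration under two assumptions: that the schedule is the common linear-warmup followed by hard-restart cosine decay, and that the run has 5604 optimizer steps in total (the last step reported in the training logs). `lr_at_step` is a hypothetical helper written for this sketch, not a library function.

```python
import math

BASE_LR = 2e-5        # learning_rate
TOTAL_STEPS = 5604    # assumption: final step reported in the training logs
WARMUP_RATIO = 0.25   # warmup_ratio
NUM_CYCLES = 3        # lr_scheduler_kwargs["num_cycles"]

# Number of warmup steps implied by the ratio.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at_step(step: int) -> float:
    """Learning rate at optimizer step `step` under the assumed schedule."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # Position inside the current cosine cycle, in [0, 1).
    cycle_pos = (NUM_CYCLES * progress) % 1.0
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))


if __name__ == "__main__":
    for step in (0, 700, 1401, 2100, 3500, 5604):
        print(f"step {step:>4}: lr = {lr_at_step(step):.3e}")
```

Under these assumptions the first quarter of training is spent ramping up, and the remaining three quarters are split into three cosine decays that each restart from the base learning rate.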
 
 
  - `eval_strategy`: steps
  - `prediction_loss_only`: True
  - `per_device_train_batch_size`: 28
+ - `per_device_eval_batch_size`: 18
  - `per_gpu_train_batch_size`: None
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
  - `weight_decay`: 1e-06
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 2
  - `max_steps`: -1
  - `lr_scheduler_type`: cosine_with_restarts
  - `lr_scheduler_kwargs`: {'num_cycles': 3}
+ - `warmup_ratio`: 0.25
  - `warmup_steps`: 0
  - `log_level`: passive
  - `log_level_replica`: warning
 
  - `use_legacy_prediction_loop`: False
  - `push_to_hub`: True
  - `resume_from_checkpoint`: None
+ - `hub_model_id`: bobox/DeBERTaV3-small-GeneralSentenceTransformer-v2-2-checkpoints-tmp
  - `hub_strategy`: checkpoint
  - `hub_private_repo`: False
  - `hub_always_push`: False
 
 
  </details>
 
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+
+ | Epoch | Step | Training Loss | nli-pairs loss | qnli-contrastive loss | scitail-pairs-pos loss | sts-test_spearman_cosine |
+ |:------:|:----:|:-------------:|:--------------:|:---------------------:|:----------------------:|:------------------------:|
+ | 0 | 0 | - | - | - | - | 0.4188 |
+ | 0.0253 | 71 | 9.7048 | - | - | - | - |
+ | 0.0503 | 141 | - | 7.9860 | 8.4771 | 6.6165 | - |
+ | 0.0507 | 142 | 8.6743 | - | - | - | - |
+ | 0.0760 | 213 | 8.101 | - | - | - | - |
+ | 0.1006 | 282 | - | 6.8505 | 7.5583 | 4.4099 | - |
+ | 0.1014 | 284 | 7.5594 | - | - | - | - |
+ | 0.1267 | 355 | 6.3548 | - | - | - | - |
+ | 0.1510 | 423 | - | 5.2238 | 6.2964 | 2.3430 | - |
+ | 0.1520 | 426 | 5.869 | - | - | - | - |
+ | 0.1774 | 497 | 5.1134 | - | - | - | - |
+ | 0.2013 | 564 | - | 4.5785 | 5.6786 | 1.8733 | - |
+ | 0.2027 | 568 | 5.1262 | - | - | - | - |
+ | 0.2281 | 639 | 3.7625 | - | - | - | - |
+ | 0.2516 | 705 | - | 3.9531 | 5.1247 | 1.6374 | - |
+ | 0.2534 | 710 | 4.5256 | - | - | - | - |
+ | 0.2787 | 781 | 3.8572 | - | - | - | - |
+ | 0.3019 | 846 | - | 3.5362 | 4.5487 | 1.5215 | - |
+ | 0.3041 | 852 | 3.9294 | - | - | - | - |
+ | 0.3294 | 923 | 3.281 | - | - | - | - |
+ | 0.3522 | 987 | - | 3.1562 | 3.7942 | 1.4236 | - |
+ | 0.3547 | 994 | 3.2531 | - | - | - | - |
+ | 0.3801 | 1065 | 3.9305 | - | - | - | - |
+ | 0.4026 | 1128 | - | 2.7059 | 3.4370 | 1.2689 | - |
+ | 0.4054 | 1136 | 3.0324 | - | - | - | - |
+ | 0.4308 | 1207 | 3.3544 | - | - | - | - |
+ | 0.4529 | 1269 | - | 2.5396 | 3.0366 | 1.2415 | - |
+ | 0.4561 | 1278 | 3.2331 | - | - | - | - |
+ | 0.4814 | 1349 | 3.1913 | - | - | - | - |
+ | 0.5032 | 1410 | - | 2.2846 | 2.7076 | 1.1422 | - |
+ | 0.5068 | 1420 | 2.7389 | - | - | - | - |
+ | 0.5321 | 1491 | 2.9541 | - | - | - | - |
+ | 0.5535 | 1551 | - | 2.1732 | 2.3780 | 1.2127 | - |
+ | 0.5575 | 1562 | 3.0911 | - | - | - | - |
+ | 0.5828 | 1633 | 2.932 | - | - | - | - |
+ | 0.6039 | 1692 | - | 2.0257 | 1.9252 | 1.1056 | - |
+ | 0.6081 | 1704 | 3.082 | - | - | - | - |
+ | 0.6335 | 1775 | 3.0328 | - | - | - | - |
+ | 0.6542 | 1833 | - | 1.9588 | 2.0366 | 1.1187 | - |
+ | 0.6588 | 1846 | 2.9508 | - | - | - | - |
+ | 0.6842 | 1917 | 2.7445 | - | - | - | - |
+ | 0.7045 | 1974 | - | 1.8310 | 1.9980 | 1.0991 | - |
+ | 0.7095 | 1988 | 2.8922 | - | - | - | - |
+ | 0.7348 | 2059 | 2.7352 | - | - | - | - |
+ | 0.7548 | 2115 | - | 1.7650 | 1.5015 | 1.1103 | - |
+ | 0.7602 | 2130 | 3.2009 | - | - | - | - |
+ | 0.7855 | 2201 | 2.6261 | - | - | - | - |
+ | 0.8051 | 2256 | - | 1.6932 | 1.6964 | 1.0409 | - |
+ | 0.8108 | 2272 | 2.6623 | - | - | - | - |
+ | 0.8362 | 2343 | 2.8281 | - | - | - | - |
+ | 0.8555 | 2397 | - | 1.6844 | 1.7854 | 1.0300 | - |
+ | 0.8615 | 2414 | 2.3096 | - | - | - | - |
+ | 0.8869 | 2485 | 2.4088 | - | - | - | - |
+ | 0.9058 | 2538 | - | 1.6698 | 1.8310 | 1.0275 | - |
+ | 0.9122 | 2556 | 2.6051 | - | - | - | - |
+ | 0.9375 | 2627 | 2.972 | - | - | - | - |
+ | 0.9561 | 2679 | - | 1.6643 | 1.8173 | 1.0215 | - |
+ | 0.9629 | 2698 | 2.4207 | - | - | - | - |
+ | 0.9882 | 2769 | 2.2772 | - | - | - | - |
+ | 1.0064 | 2820 | - | 1.7130 | 1.7650 | 1.0496 | - |
+ | 1.0136 | 2840 | 2.6348 | - | - | - | - |
+ | 1.0389 | 2911 | 2.8271 | - | - | - | - |
+ | 1.0567 | 2961 | - | 1.6939 | 2.1074 | 0.9858 | - |
+ | 1.0642 | 2982 | 2.5215 | - | - | - | - |
+ | 1.0896 | 3053 | 2.7442 | - | - | - | - |
+ | 1.1071 | 3102 | - | 1.6633 | 1.5590 | 0.9903 | - |
+ | 1.1149 | 3124 | 2.6155 | - | - | - | - |
+ | 1.1403 | 3195 | 2.7053 | - | - | - | - |
+ | 1.1574 | 3243 | - | 1.6242 | 1.6429 | 0.9740 | - |
+ | 1.1656 | 3266 | 2.9191 | - | - | - | - |
+ | 1.1909 | 3337 | 2.1112 | - | - | - | - |
+ | 1.2077 | 3384 | - | 1.6535 | 1.6226 | 0.9516 | - |
+ | 1.2163 | 3408 | 2.3519 | - | - | - | - |
+ | 1.2416 | 3479 | 1.9416 | - | - | - | - |
+ | 1.2580 | 3525 | - | 1.6103 | 1.6530 | 0.9357 | - |
+ | 1.2670 | 3550 | 2.0859 | - | - | - | - |
+ | 1.2923 | 3621 | 2.0109 | - | - | - | - |
+ | 1.3084 | 3666 | - | 1.5773 | 1.4672 | 0.9155 | - |
+ | 1.3176 | 3692 | 2.366 | - | - | - | - |
+ | 1.3430 | 3763 | 1.5532 | - | - | - | - |
+ | 1.3587 | 3807 | - | 1.5514 | 1.4451 | 0.8979 | - |
+ | 1.3683 | 3834 | 1.9982 | - | - | - | - |
+ | 1.3936 | 3905 | 2.4375 | - | - | - | - |
+ | 1.4090 | 3948 | - | 1.5254 | 1.4050 | 0.8834 | - |
+ | 1.4190 | 3976 | 1.7548 | - | - | - | - |
+ | 1.4443 | 4047 | 2.2272 | - | - | - | - |
+ | 1.4593 | 4089 | - | 1.5186 | 1.3720 | 0.8835 | - |
+ | 1.4697 | 4118 | 2.2145 | - | - | - | - |
+ | 1.4950 | 4189 | 1.8696 | - | - | - | - |
+ | 1.5096 | 4230 | - | 1.5696 | 1.0682 | 0.9336 | - |
+ | 1.5203 | 4260 | 1.4926 | - | - | - | - |
+ | 1.5457 | 4331 | 2.1193 | - | - | - | - |
+ | 1.5600 | 4371 | - | 1.5469 | 0.8180 | 0.9663 | - |
+ | 1.5710 | 4402 | 2.0298 | - | - | - | - |
+ | 1.5964 | 4473 | 1.9959 | - | - | - | - |
+ | 1.6103 | 4512 | - | 1.4656 | 1.1725 | 0.8815 | - |
+ | 1.6217 | 4544 | 2.3452 | - | - | - | - |
+ | 1.6470 | 4615 | 1.9529 | - | - | - | - |
+ | 1.6606 | 4653 | - | 1.4709 | 1.1081 | 0.9079 | - |
+ | 1.6724 | 4686 | 1.7932 | - | - | - | - |
+ | 1.6977 | 4757 | 2.1881 | - | - | - | - |
+ | 1.7109 | 4794 | - | 1.4526 | 0.9851 | 0.9167 | - |
+ | 1.7231 | 4828 | 2.1128 | - | - | - | - |
+ | 1.7484 | 4899 | 2.4772 | - | - | - | - |
+ | 1.7612 | 4935 | - | 1.4204 | 0.8683 | 0.8896 | - |
+ | 1.7737 | 4970 | 2.4336 | - | - | - | - |
+ | 1.7991 | 5041 | 1.9101 | - | - | - | - |
+ | 1.8116 | 5076 | - | 1.3821 | 1.0420 | 0.8538 | - |
+ | 1.8244 | 5112 | 2.3882 | - | - | - | - |
+ | 1.8498 | 5183 | 2.2165 | - | - | - | - |
+ | 1.8619 | 5217 | - | 1.3747 | 1.0753 | 0.8580 | - |
+ | 1.8751 | 5254 | 1.6554 | - | - | - | - |
+ | 1.9004 | 5325 | 2.3828 | - | - | - | - |
+ | 1.9122 | 5358 | - | 1.3637 | 1.0699 | 0.8557 | - |
+ | 1.9258 | 5396 | 2.3499 | - | - | - | - |
+ | 1.9511 | 5467 | 2.3972 | - | - | - | - |
+ | 1.9625 | 5499 | - | 1.3583 | 1.0596 | 0.8536 | - |
+ | 1.9764 | 5538 | 1.931 | - | - | - | - |
+ | 2.0 | 5604 | - | 1.3586 | 1.0555 | 0.8543 | 0.7193 |
+
+ </details>
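
The Epoch column in the training logs follows directly from the Step column. The short sketch below reproduces that conversion, assuming the epoch value is simply the step divided by the number of steps per epoch (5604 total steps over 2 epochs) and rounded to four decimals. `epoch_at` is a hypothetical helper for illustration only.

```python
TOTAL_STEPS = 5604   # step logged at epoch 2.0 in the table
NUM_EPOCHS = 2       # num_train_epochs
STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS

# Intervals read off the Step column: training loss is logged every
# 71 steps, evaluation losses every 141 steps.
TRAIN_LOG_EVERY = 71
EVAL_EVERY = 141


def epoch_at(step: int) -> float:
    """Fractional epoch for a given optimizer step, rounded like the table."""
    return round(step / STEPS_PER_EPOCH, 4)


if __name__ == "__main__":
    for step in (TRAIN_LOG_EVERY, EVAL_EVERY, 1410, 2820, TOTAL_STEPS):
        print(f"step {step:>4} -> epoch {epoch_at(step)}")
```

Reading the table this way makes it easier to line up a training-loss row with the nearest evaluation row, since the two are logged on different step intervals.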
+
  ### Framework Versions
  - Python: 3.10.13
  - Sentence Transformers: 3.0.1
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e75f9f0d0ccf1ea68d57e5e49eadbe854516a7a239c28fe45742d13c727c0aae
- size 565251810
+ oid sha256:52f8c3b52930fac8db5a5fe984c3799b56c40fd6ae38521081187798685d9cc5
+ size 480181136
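
The `pytorch_model.bin` change only touches a Git LFS pointer file, so the diff shows a new object hash and size rather than the weights themselves. As a rough sketch, the two pointer versions can be parsed as space-separated key/value lines to compare the recorded sizes. `parse_lfs_pointer` is a hypothetical helper written for this example, and the pointer text is copied from the diff with the `+`/`-` markers removed.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a {key: value} dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e75f9f0d0ccf1ea68d57e5e49eadbe854516a7a239c28fe45742d13c727c0aae
size 565251810
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:52f8c3b52930fac8db5a5fe984c3799b56c40fd6ae38521081187798685d9cc5
size 480181136
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

# Difference in the recorded file size, in bytes (negative means smaller).
size_delta = int(new["size"]) - int(old["size"])

if __name__ == "__main__":
    print(f"old size: {old['size']} bytes")
    print(f"new size: {new['size']} bytes")
    print(f"delta:    {size_delta} bytes")
```

The pointer records only the hash and byte count, so it shows that the stored file got smaller but not why.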