tomaarsen/st-v3-test-mpnet-base-allnli-stsb

@@ -9,48 +9,133 @@ tags:
 - generated
 base_model: microsoft/mpnet-base
 metrics:
-- accuracy
 widget:
-- source_sentence: Many youth are lazy.
   sentences:
-  - Lincoln took his hat off.
   - At the end of the fourth century was when baked goods flourished.
-  - DOD's common practice for managing this environment has been to create aggressive
-    risk reduction efforts in its programs.
-- source_sentence: a guy on a bike
   sentences:
-  - A man is on a bike.
-  - two men sit in a train car
-  - She is the boy's aunt.
-- source_sentence: The dog is wet.
   sentences:
-  - A child and small dog running.
-  - The man is riding a sheep.
-  - The man is doing a bike trick.
 - source_sentence: yeah really no kidding
   sentences:
-  - 'Really? No kidding! '
   - yeah i mean just when uh the they military paid for her education
-  - Changes were made to the Grant Renewal Application to provide extra information
-    to the LSC.
-- source_sentence: 'Harlem did a great job '
   sentences:
-  - 'Missouri was happy to continue it''s planning efforts. '
   - yeah i mean just when uh the they military paid for her education
-  - I know exactly.
 pipeline_tag: sentence-similarity
 co2_eq_emissions:
-  emissions: 18.165192544667764
   source: codecarbon
   training_type: fine-tuning
   on_cloud: false
   cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
   ram_total_size: 31.777088165283203
-  hours_used: 0.141
   hardware_used: 1 x NVIDIA GeForce RTX 3090
 ---
-# SentenceTransformer
 This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) on the [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli), [snli](https://huggingface.co/datasets/stanfordnlp/snli) and [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
@@ -98,11 +183,11 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 # Download from the 🤗 Hub
-model = SentenceTransformer("tomaarsen/st-v3-test-mpnet-base-allnli-stsb")
 # Run inference
 sentences = [
-    "Harlem did a great job ",
-    "Missouri was happy to continue it's planning efforts. ",
     "yeah i mean just when uh the they military paid for her education",
 ]
 embeddings = model.encode(sentences)
@@ -134,6 +219,44 @@ You can finetune this model on your own dataset.
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
 <!--
 ## Bias, Risks and Limitations
@@ -391,34 +514,34 @@ You can finetune this model on your own dataset.
 </details>
 ### Training Logs
-| Epoch  | Step | Training Loss | multi_nli | snli   | stsb   |
-|:------:|:----:|:-------------:|:---------:|:------:|:------:|
-| 0.0493 | 10   | 0.9204        | 1.0998    | 1.1022 | 0.2997 |
-| 0.0985 | 20   | 1.0074        | 1.0983    | 1.0971 | 0.2499 |
-| 0.1478 | 30   | 1.0037        | 1.0994    | 1.0939 | 0.1667 |
-| 0.1970 | 40   | 0.7961        | 1.0945    | 1.0877 | 0.0814 |
-| 0.2463 | 50   | 0.9882        | 1.0950    | 1.0806 | 0.0840 |
-| 0.2956 | 60   | 0.7814        | 1.0873    | 1.0711 | 0.0681 |
-| 0.3448 | 70   | 0.6678        | 1.0829    | 1.0673 | 0.0504 |
-| 0.3941 | 80   | 0.7669        | 1.0771    | 1.0638 | 0.0501 |
-| 0.4433 | 90   | 0.9718        | 1.0704    | 1.0517 | 0.0482 |
-| 0.4926 | 100  | 0.8494        | 1.0609    | 1.0388 | 0.0526 |
-| 0.5419 | 110  | 0.745         | 1.0631    | 1.0285 | 0.0527 |
-| 0.5911 | 120  | 0.6416        | 1.0564    | 1.0148 | 0.0588 |
-| 0.6404 | 130  | 1.0331        | 1.0504    | 1.0026 | 0.0627 |
-| 0.6897 | 140  | 0.8305        | 1.0417    | 1.0023 | 0.0664 |
-| 0.7389 | 150  | 0.7362        | 1.0282    | 0.9937 | 0.0672 |
-| 0.7882 | 160  | 0.7164        | 1.0288    | 0.9930 | 0.0688 |
-| 0.8374 | 170  | 0.8217        | 1.0264    | 0.9819 | 0.0677 |
-| 0.8867 | 180  | 0.9046        | 1.0200    | 0.9734 | 0.0742 |
-| 0.9360 | 190  | 0.5327        | 1.0221    | 0.9764 | 0.0698 |
-| 0.9852 | 200  | 0.8974        | 1.0233    | 0.9776 | 0.0691 |
 ### Environmental Impact
 Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
 - **Carbon Emitted**: 0.018 kg of CO2
-- **Hours Used**: 0.141 hours
 ### Training Hardware
 - **On Cloud**: No
@@ -438,7 +561,8 @@ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codec
 ## Citation
 ### BibTeX
-#### Sentence Transformers
 ```bibtex
 @inproceedings{reimers-2019-sentence-bert,
     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",

 - generated
 base_model: microsoft/mpnet-base
 metrics:
+- pearson_cosine
+- spearman_cosine
+- pearson_manhattan
+- spearman_manhattan
+- pearson_euclidean
+- spearman_euclidean
+- pearson_dot
+- spearman_dot
+- pearson_max
+- spearman_max
 widget:
+- source_sentence: 'Really? No kidding! '
   sentences:
+  - yeah really no kidding
   - At the end of the fourth century was when baked goods flourished.
+  - The campaigns seem to reach a new pool of contributors.
+- source_sentence: A sleeping man.
   sentences:
+  - Two men are sleeping.
+  - Someone is selling oranges
+  - the family is young
+- source_sentence: a guy on a bike
   sentences:
+  - A tall person on a bike
+  - A man is on a frozen lake.
+  - The women throw food at the kids
 - source_sentence: yeah really no kidding
   sentences:
+  - oh uh-huh well no they wouldn't would they no
   - yeah i mean just when uh the they military paid for her education
+  - The campaigns seem to reach a new pool of contributors.
+- source_sentence: He ran like an athlete.
   sentences:
+  - ' Then he ran.'
   - yeah i mean just when uh the they military paid for her education
+  - Similarly, OIM revised the electronic Grant Renewal Application to accommodate
+    new information sought by LSC and to ensure greater ease for users.
 pipeline_tag: sentence-similarity
 co2_eq_emissions:
+  emissions: 17.515467907816664
   source: codecarbon
   training_type: fine-tuning
   on_cloud: false
   cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
   ram_total_size: 31.777088165283203
+  hours_used: 0.13
   hardware_used: 1 x NVIDIA GeForce RTX 3090
+model-index:
+- name: SentenceTransformer based on microsoft/mpnet-base
+  results:
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: sts dev
+      type: sts-dev
+    metrics:
+    - type: pearson_cosine
+      value: 0.7331234146933103
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: 0.7435439430716654
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.7389474504545281
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.7473580293303098
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.7356264396007131
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.7436137284782617
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: 0.7093073700072118
+      name: Pearson Dot
+    - type: spearman_dot
+      value: 0.7150453113301433
+      name: Spearman Dot
+    - type: pearson_max
+      value: 0.7389474504545281
+      name: Pearson Max
+    - type: spearman_max
+      value: 0.7473580293303098
+      name: Spearman Max
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: sts test
+      type: sts-test
+    metrics:
+    - type: pearson_cosine
+      value: 0.6750510843835755
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: 0.6615639695746663
+      name: Spearman Cosine
+    - type: pearson_manhattan
+      value: 0.6718085205234632
+      name: Pearson Manhattan
+    - type: spearman_manhattan
+      value: 0.6589482932175834
+      name: Spearman Manhattan
+    - type: pearson_euclidean
+      value: 0.6693170762111229
+      name: Pearson Euclidean
+    - type: spearman_euclidean
+      value: 0.6578210069410166
+      name: Spearman Euclidean
+    - type: pearson_dot
+      value: 0.6490291380804283
+      name: Pearson Dot
+    - type: spearman_dot
+      value: 0.6335192601696299
+      name: Spearman Dot
+    - type: pearson_max
+      value: 0.6750510843835755
+      name: Pearson Max
+    - type: spearman_max
+      value: 0.6615639695746663
+      name: Spearman Max
 ---
+# SentenceTransformer based on microsoft/mpnet-base
 This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) on the [multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli), [snli](https://huggingface.co/datasets/stanfordnlp/snli) and [stsb](https://huggingface.co/datasets/mteb/stsbenchmark-sts) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 from sentence_transformers import SentenceTransformer
 # Download from the 🤗 Hub
+model = SentenceTransformer("sentence_transformers_model_id")
 # Run inference
 sentences = [
+    "He ran like an athlete.",
+    " Then he ran.",
     "yeah i mean just when uh the they military paid for her education",
 ]
 embeddings = model.encode(sentences)
 *List how the model may foreseeably be misused and address what users ought not to do with the model.*
 -->
+## Evaluation
+### Metrics
+#### Semantic Similarity
+* Dataset: `sts-dev`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+| Metric              | Value      |
+|:--------------------|:-----------|
+| pearson_cosine      | 0.7331     |
+| **spearman_cosine** | **0.7435** |
+| pearson_manhattan   | 0.7389     |
+| spearman_manhattan  | 0.7474     |
+| pearson_euclidean   | 0.7356     |
+| spearman_euclidean  | 0.7436     |
+| pearson_dot         | 0.7093     |
+| spearman_dot        | 0.715      |
+| pearson_max         | 0.7389     |
+| spearman_max        | 0.7474     |
+#### Semantic Similarity
+* Dataset: `sts-test`
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+| Metric              | Value      |
+|:--------------------|:-----------|
+| pearson_cosine      | 0.6751     |
+| **spearman_cosine** | **0.6616** |
+| pearson_manhattan   | 0.6718     |
+| spearman_manhattan  | 0.6589     |
+| pearson_euclidean   | 0.6693     |
+| spearman_euclidean  | 0.6578     |
+| pearson_dot         | 0.649      |
+| spearman_dot        | 0.6335     |
+| pearson_max         | 0.6751     |
+| spearman_max        | 0.6616     |
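
The metric names in the two tables above combine a correlation (Pearson or Spearman) with a similarity function (cosine, Manhattan, Euclidean, dot). The following is a minimal pure-Python sketch of the cosine variants on made-up toy vectors and gold scores, not the evaluator's actual implementation; the helper names (`cosine`, `pearson`, `average_ranks`, `spearman`) are hypothetical. It also illustrates the assumption, consistent with the reported values, that each `*_max` row is the maximum over the four similarity functions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (hypothetical): embedding pairs and human similarity scores.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.0]),
]
gold = [5.0, 3.0, 1.0, 0.0]

predicted = [cosine(u, v) for u, v in pairs]
pearson_cosine = pearson(predicted, gold)
spearman_cosine = spearman(predicted, gold)

# sts-dev Pearson values copied from the table above; the max over the
# four similarity functions matches the reported pearson_max row.
dev_pearson = {"cosine": 0.7331, "manhattan": 0.7389, "euclidean": 0.7356, "dot": 0.7093}
pearson_max = max(dev_pearson.values())

print(pearson_cosine, spearman_cosine, pearson_max)
```

Spearman only compares rank order, so the toy predictions score a perfect rank correlation even though their Pearson correlation with the gold scores is below 1.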
 <!--
 ## Bias, Risks and Limitations
 </details>
 ### Training Logs
+| Epoch  | Step | Training Loss | multi nli loss | snli loss | stsb loss | sts-dev spearman cosine |
+|:------:|:----:|:-------------:|:--------------:|:---------:|:---------:|:-----------------------:|
+| 0.0493 | 10   | 0.9199        | 1.1019         | 1.1017    | 0.3016    | 0.6324                  |
+| 0.0985 | 20   | 1.0063        | 1.1000         | 1.0966    | 0.2635    | 0.6093                  |
+| 0.1478 | 30   | 1.002         | 1.0995         | 1.0908    | 0.1766    | 0.5328                  |
+| 0.1970 | 40   | 0.7946        | 1.0980         | 1.0913    | 0.0923    | 0.5991                  |
+| 0.2463 | 50   | 0.9891        | 1.0967         | 1.0781    | 0.0912    | 0.6457                  |
+| 0.2956 | 60   | 0.784         | 1.0938         | 1.0699    | 0.0934    | 0.6629                  |
+| 0.3448 | 70   | 0.6735        | 1.0940         | 1.0728    | 0.0640    | 0.7538                  |
+| 0.3941 | 80   | 0.7713        | 1.0893         | 1.0676    | 0.0612    | 0.7653                  |
+| 0.4433 | 90   | 0.9772        | 1.0870         | 1.0573    | 0.0636    | 0.7621                  |
+| 0.4926 | 100  | 0.8613        | 1.0862         | 1.0515    | 0.0632    | 0.7583                  |
+| 0.5419 | 110  | 0.7528        | 1.0814         | 1.0397    | 0.0617    | 0.7536                  |
+| 0.5911 | 120  | 0.6541        | 1.0854         | 1.0329    | 0.0657    | 0.7512                  |
+| 0.6404 | 130  | 1.051         | 1.0658         | 1.0211    | 0.0607    | 0.7340                  |
+| 0.6897 | 140  | 0.8516        | 1.0631         | 1.0171    | 0.0587    | 0.7467                  |
+| 0.7389 | 150  | 0.7484        | 1.0563         | 1.0122    | 0.0556    | 0.7537                  |
+| 0.7882 | 160  | 0.7368        | 1.0534         | 1.0100    | 0.0588    | 0.7526                  |
+| 0.8374 | 170  | 0.8373        | 1.0498         | 1.0030    | 0.0565    | 0.7491                  |
+| 0.8867 | 180  | 0.9311        | 1.0387         | 0.9981    | 0.0588    | 0.7302                  |
+| 0.9360 | 190  | 0.5445        | 1.0357         | 0.9967    | 0.0565    | 0.7382                  |
+| 0.9852 | 200  | 0.9154        | 1.0359         | 0.9964    | 0.0556    | 0.7435                  |
 ### Environmental Impact
 Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
 - **Carbon Emitted**: 0.018 kg of CO2
+- **Hours Used**: 0.13 hours
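
A small arithmetic check relates the YAML metadata to the figures in this section. It assumes the `emissions` field is expressed in grams of CO2, which is the Hugging Face model-card convention as far as I know; the variable names are illustrative only.

```python
# Values copied from the co2_eq_emissions metadata in this diff.
old_emissions_g = 18.165192544667764  # removed line
new_emissions_g = 17.515467907816664  # added line

# Assumption: the metadata is in grams, the card reports kilograms.
old_kg = f"{old_emissions_g / 1000:.3f}"
new_kg = f"{new_emissions_g / 1000:.3f}"
print(old_kg, new_kg)

# Hours Used converted to minutes for readability.
new_minutes = round(0.13 * 60, 1)
print(new_minutes)
```

Both emission values round to the same three-decimal figure in kilograms, which would explain why the **Carbon Emitted** line appears as unchanged context while **Hours Used** changed.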
 ### Training Hardware
 - **On Cloud**: No
 ## Citation
 ### BibTeX
+#### Sentence Transformers and SoftmaxLoss
 ```bibtex
 @inproceedings{reimers-2019-sentence-bert,
     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",