avemio-digital committed on
Commit d0c0c25 · verified · 1 Parent(s): 8e91ed7

Update README.md

Files changed (1)
  1. README.md +17 -17
README.md CHANGED
@@ -5,7 +5,7 @@ tags:
  - sentence-similarity
  - feature-extraction
  base_model:
- - avemio/German_RAG-BGE-M3-TRIPLES-HESSIAN-AI
+ - avemio/German-RAG-BGE-M3-TRIPLES-HESSIAN-AI
  - BAAI/bge-m3
  base_model_relation: merge
  widget:
@@ -16,14 +16,14 @@ widget:
  - 'search_query: i love autotrain'
  pipeline_tag: sentence-similarity
  datasets:
- - avemio/German_RAG-EMBEDDING-TRIPLES-HESSIAN-AI
+ - avemio/German-RAG-EMBEDDING-TRIPLES-HESSIAN-AI
  ---
 
- <img src="https://www.German_RAG.ai/wp-content/uploads/2024/12/German_RAG-ICON-TO-WORDLOGO-Animation_Loop-small-ezgif.com-video-to-gif-converter.gif" alt="German_RAG Logo" width="400" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
+ <img src="https://www.German-RAG.ai/wp-content/uploads/2024/12/German-RAG-ICON-TO-WORDLOGO-Animation_Loop-small-ezgif.com-video-to-gif-converter.gif" alt="German-RAG Logo" width="400" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
 
- # German_RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI
+ # German-RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI
 
- This is a [sentence-transformers](https://www.SBERT.net) model trained on this [Dataset](https://huggingface.co/datasets/avemio/German_RAG-Embedding-Triples-Hessian-AI) with roughly 300k Triple-Samples. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ This is a [sentence-transformers](https://www.SBERT.net) model trained on this [Dataset](https://huggingface.co/datasets/avemio/German-RAG-Embedding-Triples-Hessian-AI) with roughly 300k Triple-Samples. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
  It was merged with the Base-Model [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) again to maintain performance on other languages again.
 
  ## Model Details
@@ -75,9 +75,9 @@ SentenceTransformer(
  ### STS (Semantic Textual Similarity)
  - GermanSTSBenchmark
 
- #### Comparison between Base-Model ([BGE-M3](https://huggingface.co/BAAI/bge-m3)), Finetuned Model ([German_RAG-BGE](https://huggingface.co/avemio/German_RAG-BGE-M3-TRIPLES-HESSIAN-AI)) and Merged Model with Base-Model ([Merged-BGE](https://huggingface.co/avemio/German_RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/))
+ #### Comparison between Base-Model ([BGE-M3](https://huggingface.co/BAAI/bge-m3)), Finetuned Model ([German-RAG-BGE](https://huggingface.co/avemio/German-RAG-BGE-M3-TRIPLES-HESSIAN-AI)) and Merged Model with Base-Model ([Merged-BGE](https://huggingface.co/avemio/German-RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/))
 
- | TASK | [BGE-M3](https://huggingface.co/BAAI/bge-m3) | [German_RAG-BGE](https://huggingface.co/avemio/German_RAG-BGE-M3-TRIPLES-HESSIAN-AI) | Merged-BGE | German_RAG vs. BGE | Merged vs. BGE |
+ | TASK | [BGE-M3](https://huggingface.co/BAAI/bge-m3) | [German-RAG-BGE](https://huggingface.co/avemio/German-RAG-BGE-M3-TRIPLES-HESSIAN-AI) | Merged-BGE | German-RAG vs. BGE | Merged vs. BGE |
  |-------------------------------------|-------|----------|------------|--------------|----------------|
  | AmazonCounterfactualClassification | 0.6908 | 0.5449 | **0.7111** | -14.59% | 2.03% |
  | AmazonReviewsClassification | **0.4634** | 0.2745 | 0.4571 | -18.89% | -0.63% |
@@ -91,9 +91,9 @@ SentenceTransformer(
  | MTOPIntentClassification | **0.6808** | 0.4516 | 0.6684 | -22.92% | -1.25% |
  | PawsXPairClassification | 0.5678 | 0.5077 | **0.5710** | -6.01% | 0.33% |
 
- #### Comparison between Base-Model ([BGE-M3](https://huggingface.co/BAAI/bge-m3)), Merged Model with Base-Model ([Merged-BGE](https://huggingface.co/avemio/German_RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/)) and our Merged-Model merged with [Snowflake/snowflake-arctic-embed-l-v2.0](https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0)
+ #### Comparison between Base-Model ([BGE-M3](https://huggingface.co/BAAI/bge-m3)), Merged Model with Base-Model ([Merged-BGE](https://huggingface.co/avemio/German-RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/)) and our Merged-Model merged with [Snowflake/snowflake-arctic-embed-l-v2.0](https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0)
 
- | TASK | [BGE-M3](https://huggingface.co/BAAI/bge-m3) | Merged-BGE | [Merged-Snowflake](https://huggingface.co/avemio/German_RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI/) | Merged-BGE vs. BGE | Merged-Snowflake vs. BGE | Merged-Snowflake vs. Merged-BGE |
+ | TASK | [BGE-M3](https://huggingface.co/BAAI/bge-m3) | Merged-BGE | [Merged-Snowflake](https://huggingface.co/avemio/German-RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI/) | Merged-BGE vs. BGE | Merged-Snowflake vs. BGE | Merged-Snowflake vs. Merged-BGE |
  |-------------------------------------|-------|------------|------------------|--------------------|--------------------------|---------------------------------|
  | AmazonCounterfactualClassification | 0.6908 | 0.7111 | **0.7152** | 2.94% | 3.53% | 0.58% |
  | AmazonReviewsClassification | **0.4634** | 0.4571 | 0.4577 | -1.36% | -1.23% | 0.13% |
@@ -108,20 +108,20 @@ SentenceTransformer(
  | PawsXPairClassification | 0.5678 | 0.5710 | **0.5803** | 0.56% | 2.18% | 1.63% |
 
 
- ## Evaluation on German_RAG-EMBEDDING-BENCHMARK
+ ## Evaluation on German-RAG-EMBEDDING-BENCHMARK
 
  Accuracy is calculated by evaluating if the relevant context is the highest ranking embedding of the whole context array.
- See Eval-Dataset and Evaluation Code [here](https://huggingface.co/datasets/avemio/German_RAG-EMBEDDING-BENCHMARK)
+ See Eval-Dataset and Evaluation Code [here](https://huggingface.co/datasets/avemio/German-RAG-EMBEDDING-BENCHMARK)
 
  | Model Name | Accuracy |
  |-------------------------------------------------|-----------|
  | [bge-m3](https://huggingface.co/BAAI/bge-m3 ) | 0.8806 |
  | [UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1) | 0.8393 |
- | [German_RAG-BGE-M3-TRIPLES-HESSIAN-AI](https://huggingface.co/avemio/German_RAG-BGE-M3-TRIPLES-HESSIAN-AI) | 0.8857 |
- | [German_RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI](https://huggingface.co/avemio/German_RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI) | **0.8866** |
- | [German_RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI](https://huggingface.co/avemio/German_RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI) | **0.8866** |
- | [German_RAG-UAE-LARGE-V1-TRIPLES-HESSIAN-AI](https://huggingface.co/avemio/German_RAG-UAE-LARGE-V1-TRIPLES-HESSIAN-AI) | 0.8763 |
- | [German_RAG-UAE-LARGE-V1-TRIPLES-MERGED-HESSIAN-AI](https://huggingface.co/avemio/German_RAG-UAE-LARGE-V1-TRIPLES-MERGED-HESSIAN-AI) | 0.8771 |
+ | [German-RAG-BGE-M3-TRIPLES-HESSIAN-AI](https://huggingface.co/avemio/German-RAG-BGE-M3-TRIPLES-HESSIAN-AI) | 0.8857 |
+ | [German-RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI](https://huggingface.co/avemio/German-RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI) | **0.8866** |
+ | [German-RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI](https://huggingface.co/avemio/German-RAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI) | **0.8866** |
+ | [German-RAG-UAE-LARGE-V1-TRIPLES-HESSIAN-AI](https://huggingface.co/avemio/German-RAG-UAE-LARGE-V1-TRIPLES-HESSIAN-AI) | 0.8763 |
+ | [German-RAG-UAE-LARGE-V1-TRIPLES-MERGED-HESSIAN-AI](https://huggingface.co/avemio/German-RAG-UAE-LARGE-V1-TRIPLES-MERGED-HESSIAN-AI) | 0.8771 |
 
  ## Usage
 
@@ -138,7 +138,7 @@ Then you can load this model and run inference.
  from sentence_transformers import SentenceTransformer
 
  # Download from the 🤗 Hub
- model = SentenceTransformer("avemio/German_RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI")
+ model = SentenceTransformer("avemio/German-RAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI")
  # Run inference
  sentences = [
  'The weather is lovely today.',
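
The README's evaluation section describes accuracy as whether the relevant context is the highest-ranking embedding among all candidate contexts. A minimal sketch of that top-1 metric follows. It is not the authors' evaluation code (that is linked from the benchmark dataset page). It assumes cosine similarity as the ranking score and one candidate context array per query. The function name `top1_accuracy` and the toy vectors are hypothetical.

```python
import numpy as np


def top1_accuracy(query_emb, context_emb, relevant_idx):
    """Fraction of queries whose relevant context ranks first.

    query_emb:    (n_queries, dim) array of query embeddings
    context_emb:  (n_queries, n_contexts, dim) array, one candidate set per query
    relevant_idx: (n_queries,) position of the relevant context in each set
    Ranking uses cosine similarity (an assumption, not taken from the README).
    """
    # Normalise so that the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=-1, keepdims=True)
    c = context_emb / np.linalg.norm(context_emb, axis=-1, keepdims=True)
    # scores[i, j] = similarity between query i and its j-th candidate context.
    scores = np.einsum("qd,qnd->qn", q, c)
    hits = scores.argmax(axis=1) == np.asarray(relevant_idx)
    return float(hits.mean())


# Toy data: two queries with three candidate contexts each.
queries = np.array([[1.0, 0.0], [0.0, 1.0]])
contexts = np.array([
    [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.7, 0.7], [0.1, 0.9]],
])
relevant = [0, 1]  # query 0 is ranked correctly, query 1 is not

accuracy = top1_accuracy(queries, contexts, relevant)
print(accuracy)
```

With real data, the embeddings would come from `model.encode(...)` on the queries and contexts instead of the toy arrays.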