Netta1994 committed on
Commit
f9e78ae
1 Parent(s): afa730e

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
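Only `pooling_mode_cls_token` is enabled in this config, so the sentence embedding is taken from the first (`[CLS]`) token rather than averaged over tokens. A minimal sketch of that selection in plain Python follows. The `pool` helper and the toy vectors are hypothetical, padding masks are ignored, and this is not the Sentence Transformers implementation:

```python
# Hypothetical helper illustrating the pooling flags; not library code.
POOLING_CONFIG = {
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": True,
    "pooling_mode_mean_tokens": False,
}


def pool(token_embeddings, config):
    """Reduce per-token vectors to one sentence vector."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    if config.get("pooling_mode_cls_token"):
        # CLS pooling: keep only the first token's vector.
        return list(token_embeddings[0])
    if config.get("pooling_mode_mean_tokens"):
        # Mean pooling: average each dimension over all tokens.
        n = len(token_embeddings)
        return [sum(column) / n for column in zip(*token_embeddings)]
    raise ValueError("no supported pooling mode enabled")


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy 2-dimensional embeddings
cls_vector = pool(tokens, POOLING_CONFIG)
mean_vector = pool(tokens, {"pooling_mode_mean_tokens": True})
```

With the committed flags the result depends only on the first token's vector; flipping the two booleans would switch the sketch to mean pooling.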
README.md ADDED
@@ -0,0 +1,276 @@
+ ---
+ base_model: BAAI/bge-base-en-v1.5
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: 'Reasoning:
+
+     1. **Context Grounding**: The answer is well-supported by the provided document
+     and includes specific details that align with Haribabu Kommi''s email.
+
+     2. **Relevance**: The answer directly addresses the question by listing the changes
+     being made to the storage AM as mentioned in the email.
+
+     3. **Conciseness**: The answer is clear and to the point, avoiding unnecessary
+     information.
+
+
+     The initial statement captures Haribabu Kommi''s main points, and the follow-up
+     details provide the exact changes and enhance the completeness without deviating
+     from the topic.
+
+
+     Final Result:'
+ - text: 'Reasoning:
+
+     1. **Context Grounding**: The answer accurately identifies Ning Zhongyan as the
+     gold medalist in the men''s 1,500m final at the speed skating World Cup. This
+     information matches the provided document where it is explicitly mentioned.
+
+     2. **Relevance**: The answer is directly relevant to the question asked, providing
+     the required information without straying into unrelated details.
+
+     3. **Conciseness**: The answer is concise and to the point, only mentioning the
+     necessary details about the winner and the event.
+
+
+     Final Result:'
+ - text: 'Reasoning:
+
+
+     1. Context Grounding: The answer provided is well-supported by the provided document,
+     as it correctly lists the sizes specified in the text for both individual and
+     combined portraits.
+
+     2. Relevance: The answer is directly related to the question, addressing the specific
+     sizes for the individual and combined portraits without straying into unrelated
+     information.
+
+     3. Conciseness: The answer is clear and to the point, sticking strictly to the
+     sizes without adding unnecessary details.
+
+
+     Final Result:'
+ - text: 'Reasoning:
+
+     1. Context Grounding: The answer accurately describes the components of the Student
+     Guide, which is well-supported by the provided document.
+
+     2. Relevance: The answer directly addresses the question by listing the components
+     of the British Medieval Student Guide.
+
+     3. Conciseness: The answer is concise and includes only the necessary details
+     regarding the components of the guide without extraneous information.
+
+
+     Final Result:'
+ - text: 'Reasoning:
+
+     1. **Context Grounding**: The document explicitly names the first three Members
+     of Congress as Reps. Keith Ellison, Barbara Lee, and Danny Davis. The answer provided
+     refers to Rep. Andy Harris, Reps. Kyle Evans, and Jessica Smith, which does not
+     align with the information in the document.
+
+     2. **Relevance**: The answer does not correctly address the question based on
+     the information provided in the document.
+
+     3. **Conciseness**: Although the given answer is concise, it is incorrect as it
+     names individuals who are not mentioned in the provided document.
+
+
+     Final Result:'
+ inference: true
+ model-index:
+ - name: SetFit with BAAI/bge-base-en-v1.5
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.88
+       name: Accuracy
+ ---
+
+ # SetFit with BAAI/bge-base-en-v1.5
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:---------|
+ | 0 | <ul><li>"Reasoning:\n1. Context Grounding: The answer aligns well with the provided document, specifically discussing coach Brian Shaw's influence and changes in the team strategy, which are mentioned in the text.\n2. Relevance: The response directly addresses the question by focusing on the reasons behind the Nuggets' offensive success in January, such as the new gameplay strategy advocated by the coach and increased comfort and effectiveness.\n3. Conciseness: The answer is mostly concise but adds an unsubstantiated point about virtual reality training, which is not mentioned in the document and should be excluded to maintain briefing relevance.\n\nFinal result: ****."</li><li>"Reasoning:\n1. Context Grounding: The answer effectively uses specific details from the provided document, discussing the author's experience with digital and film photography, and technical differences such as how each medium handles exposure and color capture.\n2. Relevance: The answer is directly relevant to the question, enumerating specific differences mentioned by the author.\n3. Conciseness: While mostly concise, the answer could have been slightly more succinct. However, it largely avoids unnecessary information and remains clear and to the point.\n\nFinal Result:"</li><li>"Reasoning:\n\n1. **Context Grounding:** The answer given details the results of a mixed martial arts event, specifically highlighting Antonio Rogerio Nogueira's victory. However, the question asks about the main conflict in the third book of the Arcana Chronicles by Kresley Cole. There is no relevance in the provided document or the answer to the Arcana Chronicles.\n2. **Relevance:** The answer does not address the asked question at all. Instead, it provides information about an MMA fight, which is entirely unrelated to the Arcana Chronicles.\n3. **Conciseness:** While the answer is concise, it fails to answer the appropriate question, thus making its conciseness irrelevant in this context.\n\nFinal Result:"</li></ul> |
+ | 1 | <ul><li>'Reasoning:\n\n1. Context Grounding: The answer provided is well-supported by the document and grounded in the text, which discusses best practices for web designers to avoid unnecessary revisions and conflicts. It specifically addresses parts of the document that highlight getting to know the client, signing a contract, and being honest and diplomatic.\n \n2. Relevance: The answer directly addresses the question of best practices a web designer can incorporate into their client discovery and web design process. It does not deviate into unrelated topics and remains relevant throughout.\n\n3. Conciseness: The answer is clear and concise. It covers the main points without unnecessary elaboration or inclusion of extraneous information.\n\nFinal Result:'</li><li>"Reasoning:\n\n1. Context Grounding: The answer provided is well-supported by the document. The document discusses the importance of drawing from one's own experiences, particularly those involving pain and emotion, in order to create genuine and relatable characters.\n2. Relevance: The answer directly addresses the question of what the author believes is the key to creating a connection between the reader and the characters.\n3. Conciseness: The answer is clear and to the point, avoiding unnecessary information.\n\nFinal Result:"</li><li>'Reasoning:\n1. Context Grounding: The answer directly refers to the document, which mentions Mauro Rubin as the CEO of JoinPad during the event.\n2. Relevance: The answer specifically addresses the question asked about the CEO of JoinPad during the event.\n3. Conciseness: The answer is clear, direct, and does not include unnecessary information.\n\nFinal result:'</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.88 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("Netta1994/setfit_baai_rag_ds_gpt-4o_cot-instructions_remove_final_evaluation_e1_1726759371.6896")
+ # Run inference (triple quotes are required because the input spans several lines)
+ preds = model("""Reasoning:
+ 1. Context Grounding: The answer accurately describes the components of the Student Guide, which is well-supported by the provided document.
+ 2. Relevance: The answer directly addresses the question by listing the components of the British Medieval Student Guide.
+ 3. Conciseness: The answer is concise and includes only the necessary details regarding the components of the guide without extraneous information.
+
+ Final Result:""")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 33 | 87.0704 | 188 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0 | 34 |
+ | 1 | 37 |
+
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (1, 1)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 20
+ - body_learning_rate: (2e-05, 2e-05)
+ - head_learning_rate: 2e-05
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0056 | 1 | 0.2278 | - |
+ | 0.2809 | 50 | 0.2597 | - |
+ | 0.5618 | 100 | 0.2455 | - |
+ | 0.8427 | 150 | 0.1585 | - |
+
+ ### Framework Versions
+ - Python: 3.10.14
+ - SetFit: 1.1.0
+ - Sentence Transformers: 3.1.0
+ - Transformers: 4.44.0
+ - PyTorch: 2.4.1+cu121
+ - Datasets: 2.19.2
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
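The epoch fractions in the README's Training Results table are consistent with 178 optimizer steps per epoch. The sketch below shows one way to arrive at that figure from the reported hyperparameters. It assumes that `num_iterations` produces `2 * num_iterations` sentence pairs per training sample, which is an assumption about the SetFit pair sampler and is not stated in this commit:

```python
import math

n_samples = 34 + 37      # training sample counts for labels 0 and 1
num_iterations = 20      # from the hyperparameter list
batch_size = 16          # embedding-phase batch size

# Assumed pair count: one positive and one negative pair per sample per iteration.
n_pairs = 2 * num_iterations * n_samples
steps_per_epoch = math.ceil(n_pairs / batch_size)

# Epoch fraction at each logged step, rounded to four decimals as in the table.
logged_steps = [1, 50, 100, 150]
epoch_fractions = [round(step / steps_per_epoch, 4) for step in logged_steps]
```

Under that assumption the computed fractions reproduce the Epoch column for steps 1, 50, 100 and 150.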
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-base-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
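The sizes in this config are enough for a rough parameter count. The sketch below assumes the standard BERT layout (token, position and segment embeddings, 12 identical encoder layers, and a pooler); the layout is not read from the checkpoint, so treat the result as an estimate:

```python
# Values copied from config.json above.
cfg = {
    "vocab_size": 30522,
    "hidden_size": 768,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
    "num_hidden_layers": 12,
    "type_vocab_size": 2,
}


def bert_parameter_count(c):
    """Count weights and biases of a standard BERT encoder with pooler."""
    h, i = c["hidden_size"], c["intermediate_size"]
    layer_norm = 2 * h  # scale and shift vectors
    embeddings = (
        (c["vocab_size"] + c["max_position_embeddings"] + c["type_vocab_size"]) * h
        + layer_norm
    )
    attention = 4 * (h * h + h) + layer_norm  # Q, K, V and output projections
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    pooler = h * h + h
    return embeddings + c["num_hidden_layers"] * (attention + feed_forward) + pooler


n_params = bert_parameter_count(cfg)
size_bytes = n_params * 4  # float32 weights, 4 bytes each
```

At 4 bytes per weight the estimate comes to 437,928,960 bytes, about 22 KB less than the `size 437951328` recorded for `model.safetensors`; the remainder is presumably file header and metadata, though that is not verified here.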
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.1.0",
+     "transformers": "4.44.0",
+     "pytorch": "2.4.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "labels": null,
+   "normalize_embeddings": false
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e625638d55bd3b6f156d737bcbb75cdd7ec9074d2547ca9615a72d1db0e28915
+ size 437951328
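The three lines added for `model.safetensors` are a Git LFS pointer, not the weights themselves: a spec version, a content hash, and the byte size of the real file. A small sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration and is not part of any Git LFS tooling:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:e625638d55bd3b6f156d737bcbb75cdd7ec9074d2547ca9615a72d1db0e28915\n"
    "size 437951328\n"
)
size_mib = pointer["size"] / 2**20  # bytes to mebibytes
```

The same helper applies unchanged to the `model_head.pkl` pointer, whose recorded size is far smaller because it holds only the classification head.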
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8463b306fb306cad8751510f3aeca7975047852afe4f45e4045f20fead8861fe
+ size 7007
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
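`modules.json` lists the stages of the embedding pipeline in `idx` order: a Transformer encoder, a Pooling layer, and a final Normalize step. The sketch below recovers that order from the JSON and illustrates the last stage as plain L2 normalization. The `l2_normalize` helper is a hypothetical stand-in written for this note, not the library module:

```python
import json
import math

MODULES_JSON = """[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"},
  {"idx": 2, "name": "2", "path": "2_Normalize", "type": "sentence_transformers.models.Normalize"}
]"""

# Sort by idx so the stage order does not depend on the order in the file.
modules = sorted(json.loads(MODULES_JSON), key=lambda m: m["idx"])
pipeline = [m["type"].rsplit(".", 1)[-1] for m in modules]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


unit = l2_normalize([3.0, 4.0])
```

Because the embeddings leave the pipeline with unit length, cosine similarity between two of them reduces to a dot product.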
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
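The special-token ids and `model_max_length` above determine how a single input is framed before it reaches the encoder. The sketch below assumes the usual BERT single-sequence template (`[CLS]` tokens `[SEP]`) with truncation from the right. The helper names and the placeholder word-piece ids are hypothetical, and the real tokenizer does considerably more (lowercasing, word-piece splitting):

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0   # ids from added_tokens_decoder
MODEL_MAX_LENGTH = 512                 # from model_max_length


def build_inputs(token_ids, max_length=MODEL_MAX_LENGTH):
    """Wrap word-piece ids as [CLS] ... [SEP], truncating to max_length."""
    body = token_ids[: max_length - 2]  # leave room for the two special tokens
    return [CLS_ID] + body + [SEP_ID]


def pad(input_ids, length):
    """Right-pad with [PAD] and return the ids plus the matching attention mask."""
    n_pad = max(0, length - len(input_ids))
    return input_ids + [PAD_ID] * n_pad, [1] * len(input_ids) + [0] * n_pad


short_ids = build_inputs([7592, 2088])            # two placeholder ids
long_ids = build_inputs(list(range(1000, 1600)))  # 600 ids, over the limit
padded_ids, mask = pad(short_ids, 6)
```

Inputs longer than 510 word pieces lose their tail under this scheme, which matters for this model because the classification cue `Final Result:` sits at the end of each example.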
vocab.txt ADDED
The diff for this file is too large to render. See raw diff