Netta1994 committed on
Commit
5ae3c24
1 Parent(s): 32c52be

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "word_embedding_dimension": 768,
3
+ "pooling_mode_cls_token": true,
4
+ "pooling_mode_mean_tokens": false,
5
+ "pooling_mode_max_tokens": false,
6
+ "pooling_mode_mean_sqrt_len_tokens": false,
7
+ "pooling_mode_weightedmean_tokens": false,
8
+ "pooling_mode_lasttoken": false,
9
+ "include_prompt": true
10
+ }
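The pooling configuration above enables only `pooling_mode_cls_token`, so the sentence embedding is taken from the hidden state of the first token. A minimal sketch of that selection in plain Python, contrasted with mean pooling (which this config disables). The function names and the toy 3-dimensional vectors are illustrative assumptions, not the Sentence Transformers implementation:

```python
from typing import List

Vector = List[float]

def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's vector ([CLS] in BERT-style models)."""
    return list(token_embeddings[0])

def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the vectors of non-padding tokens (disabled in this config)."""
    kept = [v for v, m in zip(token_embeddings, attention_mask) if m == 1]
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]

# Toy example: 3 tokens, 3 dimensions (the real model uses 768).
tokens = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]  # last position is padding

cls_vec = cls_pool(tokens)
mean_vec = mean_pool(tokens, mask)
```

With CLS pooling the padding mask plays no role in the pooled vector, since only position 0 is read.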
README.md ADDED
@@ -0,0 +1,253 @@
1
+ ---
2
+ base_model: BAAI/bge-base-en-v1.5
3
+ library_name: setfit
4
+ metrics:
5
+ - accuracy
6
+ pipeline_tag: text-classification
7
+ tags:
8
+ - setfit
9
+ - sentence-transformers
10
+ - text-classification
11
+ - generated_from_setfit_trainer
12
+ widget:
13
+ - text: 'Reasoning:
14
+
15
+ The answer correctly states that the College of Arts and Letters at Notre Dame
16
+ was created in 1842, which is directly supported by the document. The document
17
+ specifies that the College of Arts and Letters was established in 1842 and is
18
+ relevant and directly addresses the question without including unnecessary information.
19
+
20
+
21
+ Evaluation:'
22
+ - text: 'Reasoning:
23
+
24
+ The provided answer states, "The average student at Notre Dame travels more than
25
+ 750 miles to study there," which directly addresses the question asked. The document
26
+ confirms the accuracy of this information with the statement, "the average student
27
+ traveled more than 750 miles to Notre Dame." The answer is well-grounded in the
28
+ document, relevant to the specific question, and concise without extraneous information.
29
+
30
+
31
+ Evaluation:'
32
+ - text: 'Reasoning:
33
+
34
+ The provided answer correctly identifies Mick LaSalle as the writer for the San
35
+ Francisco Chronicle who awarded "Spectre" with a perfect score. This is directly
36
+ supported by the document, which states, "Other positive reviews from Mick LaSalle
37
+ from the San Francisco Chronicle,gave it a perfect 100 score..."
38
+
39
+
40
+ Evaluation:'
41
+ - text: 'Reasoning:
42
+
43
+ The given answer states that "The Review of Politics was inspired by German Catholic
44
+ journals and predominantly featured articles written by Karl Marx." While it correctly
45
+ identifies that the Review of Politics was inspired by German Catholic journals,
46
+ the claim that it predominantly featured articles written by Karl Marx is incorrect
47
+ and not supported by the provided document. The document makes no mention of Karl
48
+ Marx or indicates that his work was featured in the Review. Instead, it lists
49
+ other intellectual leaders like Gurian, Jacques Maritain, and Leo Richard Ward.
50
+
51
+
52
+ Evaluation:'
53
+ - text: 'Reasoning:
54
+
55
+ The provided document states that Forbes.com ranked Notre Dame 8th among research
56
+ universities in the United States. The answer given precisely matches this detail
57
+ from the document. It accurately addresses the specific question asked, without
58
+ deviating into unrelated topics or providing unnecessary information.
59
+
60
+
61
+ Evaluation:'
62
+ inference: true
63
+ model-index:
64
+ - name: SetFit with BAAI/bge-base-en-v1.5
65
+ results:
66
+ - task:
67
+ type: text-classification
68
+ name: Text Classification
69
+ dataset:
70
+ name: Unknown
71
+ type: unknown
72
+ split: test
73
+ metrics:
74
+ - type: accuracy
75
+ value: 0.9491525423728814
76
+ name: Accuracy
77
+ ---
78
+
79
+ # SetFit with BAAI/bge-base-en-v1.5
80
+
81
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
82
+
83
+ The model has been trained using an efficient few-shot learning technique that involves:
84
+
85
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
86
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
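The first of the two steps above relies on sentence pairs built from the labeled examples. The sketch below illustrates the idea only: it enumerates every pair, whereas SetFit samples pairs, and the 1.0/0.0 targets are an assumption about how a cosine-similarity loss would be fed. `make_pairs` and the toy texts are hypothetical:

```python
from itertools import combinations
from typing import List, Tuple

def make_pairs(texts: List[str], labels: List[int]) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) triples: 1.0 for same label, 0.0 otherwise."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((a, b, target))
    return pairs

texts = ["grounded answer", "supported answer", "fabricated claim"]
labels = [1, 1, 0]
pairs = make_pairs(texts, labels)
```

Fine-tuning on such pairs pulls same-label embeddings together and pushes different-label embeddings apart before the classification head is fit.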
87
+
88
+ ## Model Details
89
+
90
+ ### Model Description
91
+ - **Model Type:** SetFit
92
+ - **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
93
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
94
+ - **Maximum Sequence Length:** 512 tokens
95
+ - **Number of Classes:** 2 classes
96
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
97
+ <!-- - **Language:** Unknown -->
98
+ <!-- - **License:** Unknown -->
99
+
100
+ ### Model Sources
101
+
102
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
103
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
104
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
105
+
106
+ ### Model Labels
107
+ | Label | Examples |
108
+ |:------|:---------|
109
+ | 1 | <ul><li>'Reasoning:\nThe answer correctly identifies Joan Gaspart as the individual who resigned from the presidency of Barcelona after the team\'s poor showing in the 2003 season. This is directly supported by the document, which explicitly states that "club president Joan Gaspart resigned, his position having been made completely untenable by such a disastrous season on top of the club\'s overall decline in fortunes since he became president three years prior." The answer is concise and directly relevant to the question without including any extraneous information.\n\nEvaluation:'</li><li>"Reasoning:\nThe provided answer directly addresses the question of why it is recommended to hire a professional residential electrician like O'Hara Electric for electrical work in your house. The answer highlights key points such as the hazards of working with electricity, the potential for injury, and the long-term implications of improperly done electrical work. It also mentions the risk involved even in seemingly simple tasks like smoke detector installation and emphasizes the benefits of having the job done correctly the first time by a professional. The details are well-supported by the document.\n\nEvaluation:"</li><li>'Reasoning:\nThe answer "The title of Aerosmith\'s 1987 comeback album was \'Permanent Vacation\'" is directly supported by the provided document. The document explicitly states, "Aerosmith\'s comeback album Permanent Vacation (1987) would begin a decade long revival of their popularity." The answer is directly related to the question asked and does not deviate into unrelated topics, ensuring conciseness and relevance.\n\nEvaluation:'</li></ul> |
110
+ | 0 | <ul><li>'Reasoning:\nThe answer provides a well-supported response that aligns directly with the content presented in the document. It addresses various strategies to combat smoking cravings, such as identifying and avoiding triggers, using distractions, and engaging in alternative activities. Specific triggers, like daily routines and social situations, are described in both the answer and the document. Additionally, the advice on using chewing licorice root and engaging in smoke-free activities is related to the suggestions given in the document. The answer is clear, concise, and stays relevant to the question throughout.\n\nFinal Evaluation: \nEvaluation:'</li><li>"Reasoning:\nThe provided answer accurately captures the challenges Amy Bloom faces when starting a significant writing project, as detailed in the document. Notably, it mentions the difficulty of getting started, the need to clear mental space, and to recalibrate her daily life, which are all points grounded in the text. The answer also covers her becoming less involved in everyday life and spending less time on domestic concerns, which aligns well with the provided passage. However, the part about traveling to a remote island with no internet access is not mentioned in the document and appears to be fabricated, which detracts from the answer's context grounding.\n\nFinal Result:"</li><li>'Reasoning:\nThe provided answer incorrectly states the price and location of the 6 bedroom detached house. According to the document, the 6 bedroom detached house is for sale at a price of £950,000 and is located at Willow Drive, Twyford, Reading, Berkshire, RG10. The answer gives a different price and an incorrect location.\n\nFinal Evaluation:'</li></ul> |
111
+
112
+ ## Evaluation
113
+
114
+ ### Metrics
115
+ | Label | Accuracy |
116
+ |:--------|:---------|
117
+ | **all** | 0.9492 |
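The card does not state the size of the test split, but the unrounded accuracy in the metadata (0.9491525423728814) can be turned back into a simple fraction. A small sketch using the standard library; the denominator bound of 1000 is an arbitrary assumption about how large the test set might be:

```python
from fractions import Fraction

reported_accuracy = 0.9491525423728814  # value from the model-index metadata

# Find the simplest fraction close to the reported value.
# The bound of 1000 is an assumption, not a fact from the card.
ratio = Fraction(reported_accuracy).limit_denominator(1000)
correct, total = ratio.numerator, ratio.denominator
```

The result is consistent with 56 correct predictions out of 59 test examples, or a multiple of that ratio; it does not prove the actual split size.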
118
+
119
+ ## Uses
120
+
121
+ ### Direct Use for Inference
122
+
123
+ First install the SetFit library:
124
+
125
+ ```bash
126
+ pip install setfit
127
+ ```
128
+
129
+ Then you can load this model and run inference.
130
+
131
+ ```python
132
+ from setfit import SetFitModel
133
+
134
+ # Download from the 🤗 Hub
135
+ model = SetFitModel.from_pretrained("Netta1994/setfit_baai_squad_gpt-4o_improved-cot-instructions_chat_few_shot_generated_remove_fin")
136
+ # Run inference
137
+ preds = model("""Reasoning:
138
+ The provided answer correctly identifies Mick LaSalle as the writer for the San Francisco Chronicle who awarded "Spectre" with a perfect score. This is directly supported by the document, which states, "Other positive reviews from Mick LaSalle from the San Francisco Chronicle,gave it a perfect 100 score..."
139
+
140
+ Evaluation:""")
141
+ ```
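Most examples in the card share one layout: a `Reasoning:` line, the reasoning text, a blank line, and a trailing `Evaluation:` marker. A hedged helper for producing inputs in that layout is sketched below. `build_input` is a hypothetical name, and the layout is inferred from the examples rather than documented by the author (some label-0 examples end differently, for instance with `Final Evaluation:`):

```python
def build_input(reasoning: str) -> str:
    """Wrap a reasoning paragraph in the layout seen in the card's examples."""
    return f"Reasoning:\n{reasoning.strip()}\n\nEvaluation:"

text = build_input("The answer is directly supported by the document.")
# `text` could then be passed to the loaded model, e.g. model(text)
# or model.predict([text]) for a batch.
```

Keeping inference inputs close to the training layout is generally safer for a few-shot classifier trained on only 199 examples.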
142
+
143
+ <!--
144
+ ### Downstream Use
145
+
146
+ *List how someone could finetune this model on their own dataset.*
147
+ -->
148
+
149
+ <!--
150
+ ### Out-of-Scope Use
151
+
152
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
153
+ -->
154
+
155
+ <!--
156
+ ## Bias, Risks and Limitations
157
+
158
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
159
+ -->
160
+
161
+ <!--
162
+ ### Recommendations
163
+
164
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
165
+ -->
166
+
167
+ ## Training Details
168
+
169
+ ### Training Set Metrics
170
+ | Training set | Min | Mean | Max |
171
+ |:-------------|:----|:--------|:----|
172
+ | Word count | 33 | 76.9045 | 176 |
173
+
174
+ | Label | Training Sample Count |
175
+ |:------|:----------------------|
176
+ | 0 | 95 |
177
+ | 1 | 104 |
178
+
179
+ ### Training Hyperparameters
180
+ - batch_size: (16, 16)
181
+ - num_epochs: (1, 1)
182
+ - max_steps: -1
183
+ - sampling_strategy: oversampling
184
+ - num_iterations: 20
185
+ - body_learning_rate: (2e-05, 2e-05)
186
+ - head_learning_rate: 2e-05
187
+ - loss: CosineSimilarityLoss
188
+ - distance_metric: cosine_distance
189
+ - margin: 0.25
190
+ - end_to_end: False
191
+ - use_amp: False
192
+ - warmup_proportion: 0.1
193
+ - l2_weight: 0.01
194
+ - seed: 42
195
+ - eval_max_steps: -1
196
+ - load_best_model_at_end: False
197
+
198
+ ### Training Results
199
+ | Epoch | Step | Training Loss | Validation Loss |
200
+ |:------:|:----:|:-------------:|:---------------:|
201
+ | 0.0020 | 1 | 0.2375 | - |
202
+ | 0.1004 | 50 | 0.2548 | - |
203
+ | 0.2008 | 100 | 0.2339 | - |
204
+ | 0.3012 | 150 | 0.0973 | - |
205
+ | 0.4016 | 200 | 0.0347 | - |
206
+ | 0.5020 | 250 | 0.0125 | - |
207
+ | 0.6024 | 300 | 0.0058 | - |
208
+ | 0.7028 | 350 | 0.0039 | - |
209
+ | 0.8032 | 400 | 0.0033 | - |
210
+ | 0.9036 | 450 | 0.0023 | - |
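The epoch fractions in the table can be reproduced from the hyperparameters and sample counts listed earlier. The sketch below assumes that each of the `num_iterations` rounds yields one positive and one negative pair per training sample; that pair-count formula is an assumption about the sampler, not something the card states:

```python
import math

num_samples = 95 + 104   # training sample counts for labels 0 and 1
num_iterations = 20      # from the hyperparameter list
batch_size = 16          # embedding-phase batch size

# Assumption: each iteration yields one positive and one negative pair per sample.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)

def epoch_at(step: int) -> float:
    """Fraction of the epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)
```

Under that assumption there are 7960 pairs and 498 steps in the single epoch, which matches the logged values such as 0.1004 at step 50 and 0.9036 at step 450.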
211
+
212
+ ### Framework Versions
213
+ - Python: 3.10.14
214
+ - SetFit: 1.1.0
215
+ - Sentence Transformers: 3.1.1
216
+ - Transformers: 4.44.0
217
+ - PyTorch: 2.4.0+cu121
218
+ - Datasets: 3.0.0
219
+ - Tokenizers: 0.19.1
220
+
221
+ ## Citation
222
+
223
+ ### BibTeX
224
+ ```bibtex
225
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
226
+ doi = {10.48550/ARXIV.2209.11055},
227
+ url = {https://arxiv.org/abs/2209.11055},
228
+ author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
229
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
230
+ title = {Efficient Few-Shot Learning Without Prompts},
231
+ publisher = {arXiv},
232
+ year = {2022},
233
+ copyright = {Creative Commons Attribution 4.0 International}
234
+ }
235
+ ```
236
+
237
+ <!--
238
+ ## Glossary
239
+
240
+ *Clearly define terms in order to be accessible across audiences.*
241
+ -->
242
+
243
+ <!--
244
+ ## Model Card Authors
245
+
246
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
247
+ -->
248
+
249
+ <!--
250
+ ## Model Card Contact
251
+
252
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
253
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
1
+ {
2
+ "_name_or_path": "BAAI/bge-base-en-v1.5",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "id2label": {
13
+ "0": "LABEL_0"
14
+ },
15
+ "initializer_range": 0.02,
16
+ "intermediate_size": 3072,
17
+ "label2id": {
18
+ "LABEL_0": 0
19
+ },
20
+ "layer_norm_eps": 1e-12,
21
+ "max_position_embeddings": 512,
22
+ "model_type": "bert",
23
+ "num_attention_heads": 12,
24
+ "num_hidden_layers": 12,
25
+ "pad_token_id": 0,
26
+ "position_embedding_type": "absolute",
27
+ "torch_dtype": "float32",
28
+ "transformers_version": "4.44.0",
29
+ "type_vocab_size": 2,
30
+ "use_cache": true,
31
+ "vocab_size": 30522
32
+ }
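The values in this config are enough to estimate the encoder's parameter count. The sketch below assumes the standard BERT layout (token, position, and segment embeddings, 12 identical encoder layers, and a pooler head) and counts weights and biases only. `bert_parameter_count` is a hypothetical helper, not part of Transformers:

```python
def bert_parameter_count(vocab_size: int, hidden: int, layers: int,
                         intermediate: int, max_positions: int,
                         type_vocab: int) -> int:
    """Count weights and biases of a BERT encoder with a pooler head."""
    layer_norm = 2 * hidden  # scale and bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    attention = 4 * (hidden * hidden + hidden) + layer_norm  # Q, K, V, output
    feed_forward = (hidden * intermediate + intermediate
                    + intermediate * hidden + hidden) + layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler

params = bert_parameter_count(vocab_size=30522, hidden=768, layers=12,
                              intermediate=3072, max_positions=512, type_vocab=2)
float32_bytes = params * 4  # "torch_dtype": "float32"
```

The estimate of roughly 109.5 million parameters, about 438 MB in float32, is close to the 437951328-byte `model.safetensors` pointer in this commit; the small remainder is plausibly file metadata, though that is not verified here.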
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.1.1",
4
+ "transformers": "4.44.0",
5
+ "pytorch": "2.4.0+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": null
10
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "normalize_embeddings": false,
3
+ "labels": null
4
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0b058faaa596edd69b23038325e6860c4a1f1b0f8a03e6f89791a171f09a2c28
3
+ size 437951328
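The three lines above are a Git LFS pointer, not the weights themselves: the repository stores the spec version, a SHA-256 object ID, and the byte size, while the binary lives in LFS storage. A minimal parsing sketch, with `parse_lfs_pointer` as a hypothetical helper name:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0b058faaa596edd69b23038325e6860c4a1f1b0f8a03e6f89791a171f09a2c28
size 437951328
"""

info = parse_lfs_pointer(pointer)
size_mib = int(info["size"]) / (1024 ** 2)
```

The same layout applies to the `model_head.pkl` pointer, whose 7007-byte size reflects the small logistic-regression head.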
model_head.pkl ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f006db18912008dfd2fa5c3b984b1ae0f0f0050b2d4fc3fb193f90d22f67dfa9
3
+ size 7007
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
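The module list chains a Transformer, a Pooling step, and a final Normalize step. The last stage can be sketched as L2 normalization, after which cosine similarity between two embeddings is just a dot product. This is a plain-Python illustration with toy 2-dimensional vectors and hypothetical function names, not the library code:

```python
import math
from typing import List

def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """For unit vectors, cosine similarity reduces to a dot product."""
    return sum(x * y for x, y in zip(a, b))

u = l2_normalize([3.0, 4.0])
v = l2_normalize([4.0, 3.0])
similarity = cosine_similarity(u, v)
```

Unit-length embeddings also keep the features fed to the logistic-regression head on a consistent scale.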
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": true
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": true,
48
+ "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
+ "never_split": null,
51
+ "pad_token": "[PAD]",
52
+ "sep_token": "[SEP]",
53
+ "strip_accents": null,
54
+ "tokenize_chinese_chars": true,
55
+ "tokenizer_class": "BertTokenizer",
56
+ "unk_token": "[UNK]"
57
+ }
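The special-token IDs and `model_max_length` in this config determine how a single sequence is framed before it reaches the encoder. The sketch below shows that framing with plain lists; `wrap_and_pad` is a hypothetical helper, right-padding is assumed, and the two word-piece IDs are placeholders rather than real tokenizer output:

```python
from typing import List

# IDs taken from "added_tokens_decoder" in the tokenizer config.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102
MAX_LENGTH = 512  # "model_max_length"

def wrap_and_pad(token_ids: List[int], length: int) -> List[int]:
    """Add [CLS]/[SEP], truncate to `length`, and right-pad with [PAD]."""
    if length > MAX_LENGTH:
        raise ValueError("length exceeds model_max_length")
    body = token_ids[: length - 2]  # leave room for the two special tokens
    wrapped = [CLS_ID] + body + [SEP_ID]
    return wrapped + [PAD_ID] * (length - len(wrapped))

# 7592 and 2088 are placeholder word-piece IDs chosen for illustration.
ids = wrap_and_pad([7592, 2088], length=6)
```

Because the pooling config reads the `[CLS]` position, the token with ID 101 at index 0 is the one whose hidden state becomes the sentence embedding.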
vocab.txt ADDED
The diff for this file is too large to render. See raw diff