Update README.md
README.md
CHANGED
@@ -1,3 +1,184 @@
---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
widget:
- text: amy and matthew have a bit of a phony relationship , but the film works in
    spite of it .
- text: it 's refreshing to see a romance this smart .
- text: bogdanich is unashamedly pro-serbian and makes little attempt to give voice
    to the other side .
- text: sayles has an eye for the ways people of different ethnicities talk to and
    about others outside the group .
- text: eddie murphy and owen wilson have a cute partnership in i spy , but the movie
    around them is so often nearly nothing that their charm does n't do a load of
    good .
pipeline_tag: text-classification
inference: true
base_model: BAAI/bge-small-en-v1.5
model-index:
- name: SetFit with BAAI/bge-small-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8478857770455793
      name: Accuracy
---
# SetFit with BAAI/bge-small-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
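
The contrastive step in (1) works on pairs of sentences rather than single sentences. The following is a minimal sketch of how such pairs could be built from a few labeled examples, using sentences from the Model Labels table of this card. The helper name `make_contrastive_pairs` is hypothetical, and enumerating every pair is a simplification: the SetFit trainer samples pairs according to its own sampling strategy.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from a small labeled set.

    The target is 1.0 when both texts share a label and 0.0 otherwise.
    This enumerates all pairs, which is an illustrative simplification.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Labeled examples taken from the Model Labels table of this card.
examples = [
    ("but it could be , by its art and heart , a necessary one .", "positive"),
    ("a culture-clash comedy that , in addition to being very funny , captures "
     "some of the discomfort and embarrassment of being a bumbling american in europe .",
     "positive"),
    ("-lrb- a -rrb- crushing disappointment .", "negative"),
    ("represents the depths to which the girls-behaving-badly film has fallen .",
     "negative"),
]
texts = [text for text, _ in examples]
labels = [label for _, label in examples]

pairs = make_contrastive_pairs(texts, labels)
positive_pairs = [p for p in pairs if p[2] == 1.0]
negative_pairs = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positive_pairs), len(negative_pairs))
```

Even four labeled sentences yield six training pairs, which is part of why this approach can work with few examples per class.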

## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label    | Examples |
|:---------|:---------|
| negative | <ul><li>'there might be some sort of credible gender-provoking philosophy submerged here , but who the hell cares ?'</li><li>'represents the depths to which the girls-behaving-badly film has fallen .'</li><li>'-lrb- a -rrb- crushing disappointment .'</li></ul> |
| positive | <ul><li>'what saves it ... and makes it one of the better video-game-based flicks , is that the film acknowledges upfront that the plot makes no sense , such that the lack of linearity is the point of emotional and moral departure for protagonist alice .'</li><li>'but it could be , by its art and heart , a necessary one .'</li><li>'a culture-clash comedy that , in addition to being very funny , captures some of the discomfort and embarrassment of being a bumbling american in europe .'</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.862    |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Jorgeutd/setfit-bge-small-v1.5-sst2-50-shot")
# Run inference
preds = model("it 's refreshing to see a romance this smart .")
```
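
The call above classifies a single sentence. For several sentences at once, a small helper along the following lines may be convenient. This is a sketch under assumptions: it relies on the `predict` method of `SetFitModel` accepting a list of strings, as in SetFit 1.0.x, and the names `load_model` and `predict_batch` are hypothetical, not part of the library.

```python
def load_model(model_id="Jorgeutd/setfit-bge-small-v1.5-sst2-50-shot"):
    """Load the model from the Hub (requires `pip install setfit` and network access)."""
    # Imported lazily so predict_batch can be used with any object exposing .predict().
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)


def predict_batch(model, texts):
    """Return (text, prediction) pairs for a collection of input sentences."""
    texts = list(texts)
    preds = model.predict(texts)  # one prediction per input text
    return list(zip(texts, preds))
```

A typical call would be `predict_batch(load_model(), ["it 's refreshing to see a romance this smart ."])`. The exact type of each prediction (string label or integer id) depends on how the model was saved.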

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count   | 6   | 22.5   | 45  |

| Label    | Training Sample Count |
|:---------|:----------------------|
| negative | 50                    |
| positive | 50                    |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (10, 10)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
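
The `loss: CosineSimilarityLoss` entry refers to the Sentence Transformers loss of that name, which with its default settings penalizes the squared difference between the cosine similarity of two sentence embeddings and a target similarity. The following plain-Python sketch shows the per-pair computation. The toy 2-d vectors stand in for real embeddings, and the targets of 1.0 for same-label pairs and 0.0 for different-label pairs are an assumption about how pairs are labeled; the library itself operates on batched tensors and averages over the batch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(u, v, target):
    """Squared error between the cosine similarity of (u, v) and the target."""
    return (cosine_similarity(u, v) - target) ** 2


# Identical embeddings for a same-label pair: no penalty.
loss_same = cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], 1.0)
# Orthogonal embeddings for a different-label pair: no penalty.
loss_diff = cosine_similarity_loss([1.0, 0.0], [0.0, 1.0], 0.0)
# Identical embeddings for a different-label pair: maximal penalty for this target.
loss_bad = cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], 0.0)
print(loss_same, loss_diff, loss_bad)
```

Minimizing this quantity pulls embeddings of same-label sentences together and pushes embeddings of different-label sentences apart, which is what makes the downstream classification head easier to train.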

### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:-----:|:----:|:-------------:|:---------------:|
| 0.2   | 1    | 0.2109        | -               |
| 10.0  | 50   | 0.01          | -               |

### Framework Versions
- Python: 3.10.11
- SetFit: 1.0.3
- Sentence Transformers: 2.3.1
- Transformers: 4.37.2
- PyTorch: 2.2.0+cu121
- Datasets: 2.16.1
- Tokenizers: 0.15.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```
---
license: apache-2.0
---