yaniseuranova committed commit 47db620 (1 parent: 7f18162)
Add SetFit model
- README.md +43 -40
- config.json +1 -1
- model.safetensors +1 -1
- model_head.pkl +1 -1
- sentencepiece.bpe.model +3 -0
README.md
CHANGED
@@ -9,16 +9,16 @@ base_model: BAAI/bge-m3
|
|
9 |
metrics:
|
10 |
- accuracy
|
11 |
widget:
|
12 |
-
- text:
|
13 |
-
|
14 |
-
- text:
|
15 |
-
|
16 |
-
- text: What is the primary
|
17 |
-
|
18 |
-
|
19 |
-
|
20 |
-
|
21 |
-
|
22 |
pipeline_tag: text-classification
|
23 |
inference: true
|
24 |
model-index:
|
@@ -65,10 +65,10 @@ The model has been trained using an efficient few-shot learning technique that i
|
|
65 |
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
|
66 |
|
67 |
### Model Labels
|
68 |
-
| Label | Examples
|
69 |
-
|
70 |
-
| lexical | <ul><li>'What is the
|
71 |
-
| semantic | <ul><li>
|
72 |
|
73 |
## Evaluation
|
74 |
|
@@ -95,7 +95,7 @@ from setfit import SetFitModel
|
|
95 |
# Download from the 🤗 Hub
|
96 |
model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
|
97 |
# Run inference
|
98 |
-
preds = model("
|
99 |
```
|
100 |
|
101 |
<!--
|
@@ -127,12 +127,12 @@ preds = model("Qui est Robin Mancini ?")
|
|
127 |
### Training Set Metrics
|
128 |
| Training set | Min | Median | Max |
|
129 |
|:-------------|:----|:--------|:----|
|
130 |
-
| Word count | 4 | 19.
|
131 |
|
132 |
| Label | Training Sample Count |
|
133 |
|:---------|:----------------------|
|
134 |
-
| lexical |
|
135 |
-
| semantic |
|
136 |
|
137 |
### Training Hyperparameters
|
138 |
- batch_size: (16, 16)
|
@@ -154,27 +154,30 @@ preds = model("Qui est Robin Mancini ?")
|
|
154 |
### Training Results
|
155 |
| Epoch | Step | Training Loss | Validation Loss |
|
156 |
|:-------:|:-------:|:-------------:|:---------------:|
|
157 |
-
| 0.
|
158 |
-
| 0.
|
159 |
-
| 0.
|
160 |
-
| 0.
|
161 |
-
| 0.
|
162 |
-
| 1.0
|
163 |
-
| 1.
|
164 |
-
| 1.
|
165 |
-
| 1.
|
166 |
-
| 1.
|
167 |
-
|
|
168 |
-
| 2.
|
169 |
-
| 2.
|
170 |
-
| 2.
|
171 |
-
| 2.
|
172 |
-
|
|
173 |
-
|
|
174 |
-
| 3.
|
175 |
-
| 3.
|
176 |
-
| 3.
|
177 |
-
|
|
|
|
|
|
|
|
178 |
|
179 |
* The bold row denotes the saved checkpoint.
|
180 |
### Framework Versions
|
@@ -182,7 +185,7 @@ preds = model("Qui est Robin Mancini ?")
|
|
182 |
- SetFit: 1.0.3
|
183 |
- Sentence Transformers: 2.6.1
|
184 |
- Transformers: 4.39.0
|
185 |
-
- PyTorch: 2.3.
|
186 |
- Datasets: 2.18.0
|
187 |
- Tokenizers: 0.15.2
|
188 |
|
|
|
9 |
metrics:
|
10 |
- accuracy
|
11 |
widget:
|
12 |
+
- text: How doCompaniesbalanceIndividualCreativitywithTeamCollaboration to driveInnovationinthe
|
13 |
+
WORKPlace?
|
14 |
+
- text: How do the values of a learning organization impact its ability to innovate
|
15 |
+
and respond to constant change?
|
16 |
+
- text: What is the primary function of the Domain Name System (DNS) layer in the
|
17 |
+
Internet Protocol Stack, as defined by ICANN?
|
18 |
+
- text: What distinguishes a transforming industry from one that merely innovates
|
19 |
+
to existing practices?
|
20 |
+
- text: How can artificial intelligence systems balance individual autonomy with collective
|
21 |
+
responsibility in decision-making processes?
|
22 |
pipeline_tag: text-classification
|
23 |
inference: true
|
24 |
model-index:
|
|
|
65 |
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
|
66 |
|
67 |
### Model Labels
|
68 |
+
| Label | Examples |
|
69 |
+
|:---------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
70 |
+
| lexical | <ul><li>'What is the primary function of the Apache Kafka distributed streaming platform in Big Data processing?'</li><li>"What is the primary difference between Hadoop's FileSystem-based architecture and Apache Cassandra's distributed, masterlessArchitecture in scale-out design?"</li><li>'What is the main difference between optimistic concurrency control and pessimistic concurrency control in database management systems?'</li></ul> |
|
71 |
+
| semantic | <ul><li>"How does organizational morale impact the competitiveness of a company in today's fast-paced market?"</li><li>'How do organizations balance individual creativity with collective goal achievement in a dynamic environment?'</li><li>'What is a key challenge faced by managers in sustaining a work culture that encourages creativity, innovation, and critical thinking within the technological industry globally?'</li></ul> |
|
72 |
|
73 |
## Evaluation
|
74 |
|
|
|
95 |
# Download from the 🤗 Hub
|
96 |
model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
|
97 |
# Run inference
|
98 |
+
preds = model("What distinguishes a transforming industry from one that merely innovates to existing practices?")
|
99 |
```
|
100 |
|
101 |
<!--
|
|
|
127 |
### Training Set Metrics
|
128 |
| Training set | Min | Median | Max |
|
129 |
|:-------------|:----|:--------|:----|
|
130 |
+
| Word count | 4 | 19.1839 | 42 |
|
131 |
|
132 |
| Label | Training Sample Count |
|
133 |
|:---------|:----------------------|
|
134 |
+
| lexical | 43 |
|
135 |
+
| semantic | 44 |
|
136 |
|
137 |
### Training Hyperparameters
|
138 |
- batch_size: (16, 16)
|
|
|
154 |
### Training Results
|
155 |
| Epoch | Step | Training Loss | Validation Loss |
|
156 |
|:-------:|:-------:|:-------------:|:---------------:|
|
157 |
+
| 0.0041 | 1 | 0.2391 | - |
|
158 |
+
| 0.2066 | 50 | 0.0033 | - |
|
159 |
+
| 0.4132 | 100 | 0.0007 | - |
|
160 |
+
| 0.6198 | 150 | 0.0007 | - |
|
161 |
+
| 0.8264 | 200 | 0.0007 | - |
|
162 |
+
| **1.0** | **242** | **-** | **0.0001** |
|
163 |
+
| 1.0331 | 250 | 0.0005 | - |
|
164 |
+
| 1.2397 | 300 | 0.0004 | - |
|
165 |
+
| 1.4463 | 350 | 0.0004 | - |
|
166 |
+
| 1.6529 | 400 | 0.0003 | - |
|
167 |
+
| 1.8595 | 450 | 0.0004 | - |
|
168 |
+
| 2.0 | 484 | - | 0.0001 |
|
169 |
+
| 2.0661 | 500 | 0.0003 | - |
|
170 |
+
| 2.2727 | 550 | 0.0003 | - |
|
171 |
+
| 2.4793 | 600 | 0.0002 | - |
|
172 |
+
| 2.6860 | 650 | 0.0003 | - |
|
173 |
+
| 2.8926 | 700 | 0.0002 | - |
|
174 |
+
| 3.0 | 726 | - | 0.0001 |
|
175 |
+
| 3.0992 | 750 | 0.0003 | - |
|
176 |
+
| 3.3058 | 800 | 0.0002 | - |
|
177 |
+
| 3.5124 | 850 | 0.0002 | - |
|
178 |
+
| 3.7190 | 900 | 0.0002 | - |
|
179 |
+
| 3.9256 | 950 | 0.0003 | - |
|
180 |
+
| 4.0 | 968 | - | 0.0001 |
|
181 |
|
182 |
* The bold row denotes the saved checkpoint.
|
183 |
### Framework Versions
|
|
|
185 |
- SetFit: 1.0.3
|
186 |
- Sentence Transformers: 2.6.1
|
187 |
- Transformers: 4.39.0
|
188 |
+
- PyTorch: 2.3.1+cu121
|
189 |
- Datasets: 2.18.0
|
190 |
- Tokenizers: 0.15.2
|
191 |
|
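The Epoch and Step columns in the Training Results hunk appear to be linked by a fixed 242 steps per epoch, which is also the step of the bold (saved) checkpoint row and matches the `checkpoints/step_242` path in the config.json change. The following is a minimal Python sketch of that arithmetic, not code from the commit: the names `epoch_at` and `checkpoint_name` are hypothetical, and the 242 figure is read off the table rather than taken from any training script.

```python
# Assumption: one epoch is 242 steps, read off the bold row (epoch 1.0, step 242).
STEPS_PER_EPOCH = 242

# (epoch, step) pairs copied from rows of the Training Results table in the diff.
LOGGED_ROWS = [
    (0.0041, 1),
    (0.2066, 50),
    (0.4132, 100),
    (1.0331, 250),
    (2.6860, 650),
    (3.9256, 950),
    (4.0, 968),
]


def epoch_at(step: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> float:
    """Return the fractional epoch reached after `step` training steps."""
    return step / steps_per_epoch


def checkpoint_name(step: int) -> str:
    """Hypothetical helper mirroring the 'checkpoints/step_242' naming seen in config.json."""
    return f"checkpoints/step_{step}"


# Each logged epoch value should agree with step / 242 up to 4-decimal rounding.
for logged_epoch, step in LOGGED_ROWS:
    assert abs(epoch_at(step) - logged_epoch) < 1e-4, (logged_epoch, step)
```

Under that assumption, the validation rows at steps 242, 484, 726, and 968 fall exactly on epoch boundaries, and `checkpoint_name(242)` reproduces the path recorded in config.json.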
config.json
CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "checkpoints/
+  "_name_or_path": "checkpoints/step_242",
   "architectures": [
     "XLMRobertaModel"
   ],
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b1b888990ee5269d1f4c3795f8aeeb46da209d188e03543ea23b7fa884aaf2b5
 size 2271064456
model_head.pkl
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:96d22b0c74de93b5a70d706bf42366826ce9a80c2d3a555a2fadaed9e3d0c5e3
 size 9087
sentencepiece.bpe.model
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
+size 5069051