mann2107 committed
Commit 1dacafd
Parent: 2c31bc9

Push model using huggingface_hub.

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false
}
README.md ADDED
@@ -0,0 +1,205 @@
---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
metrics:
- accuracy
widget:
- text: 'Can you please send me flight quotations for Mr Mthetho Sovara for travel
    to Bologna, Italy as per details below: 7 Oct: JHB to Bologna, Italy 14 Oct: Bologna,
    Italy to JHB'
- text: Your warranty is about to expire. Click here to extend it and avoid costly
    repairs.
- text: Family emergency means I won't make my reservation. How can I get my money
    back?
- text: 'Your flight reservation with Delta Airlines has been confirmed. Flight #DL102
    from JFK to ATL on November 20th, departure at 5:00 PM.'
- text: I need invoice please with Engela Petzer name
pipeline_tag: text-classification
inference: true
base_model: sentence-transformers/all-MiniLM-L6-v2
---

# SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) as the Sentence Transformer embedding model. A [SetFitHead](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.

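The snippet below sketches how a model like this one can be trained end to end with the `setfit` `Trainer`. It is illustrative rather than a copy of the original training script: the two inline examples stand in for the real training set (roughly 25 examples per class, see Training Details), and the hyperparameters actually used are listed under Training Hyperparameters below.

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Tiny illustrative dataset; labels follow this card's integer scheme
# (e.g. 5 = cancellation requests, 7 = invoice requests).
train_dataset = Dataset.from_dict({
    "text": [
        "Please cancel my flight for next Tuesday.",
        "Can you send me the invoice for my last stay?",
    ],
    "label": [5, 7],
})

# The Sentence Transformer body; use_differentiable_head attaches a SetFitHead
# with 14 output classes instead of the default logistic-regression head.
model = SetFitModel.from_pretrained(
    "sentence-transformers/all-MiniLM-L6-v2",
    use_differentiable_head=True,
    head_params={"out_features": 14},
)

args = TrainingArguments(batch_size=8, num_epochs=1)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # contrastive fine-tuning of the body, then training of the head
```
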
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- **Classification head:** a [SetFitHead](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance
- **Maximum Sequence Length:** 256 tokens
- **Number of Classes:** 14 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 12 | <ul><li>"My itinerary's been turned upside down, forcing me to cancel. What's the process for a refund?"</li><li>'Yesterday my flight AA 273 was canceled and I had to be rebooked on a British airways flight for today. Is there anyway for you to get some sort of refund from American Airlines?I recently cancelled a trip reservation for travel to Midland, TX on 10/3/2019. The confirmation code: WUOQZY. When can I expect to see a refund of the airfare charge, or will I see a credit that I can use for future air travel?'</li><li>'My flight got cancelled. Can you walk me through how to claim a refund?'</li></ul> |
| 11 | <ul><li>'Please create a new travel profile for: Annette Marie Marshall [email protected] Department code: 4500 Install Level Manager is: Aaron W. Brandt Location: Fort Worth Office'</li><li>'Please add the below employee to our existing system. Sandy Faucett [email protected] '</li><li>'Can you change the saved payment method in my profile?'</li></ul> |
| 8 | <ul><li>'I need my itinerary for my outbound flight on Feb 5, 256pm, Flt 898. Please send.'</li><li>'Al asked me to locate his return ticket to verify the confirmation number. Can you forward that to me please ?'</li><li>'Just checking in to see about the status of the below 2 members airline itineraries?? Jose Gustavo Robles Leon, Adrian Fabricio Villarreal Alfaro, CLI, FSCP. Thanks very much'</li></ul> |
| 7 | <ul><li>'Hello, Can you please send me a list my 5 latest invoices? Thanks'</li><li>'I need your assistance in getting all the travel Tev Finger has done to NY from January 2017 until December 2018. Could you please send me all the invoices you have for the two years. You could also send them to me on a spreadsheet. I also need to know the hotels where he stayed in NY.'</li><li>'Please can you provide with the invoices for my stays this month as follows: 1. Premier Splendid Inn Bayshore (07 Aug - 08 Aug) 2. Port Nolloth Beach Shack (14 Aug - 17 Aug)'</li></ul> |
| 13 | <ul><li>'Boost your credit score by 100 points in just 30 days. Click here to get started.'</li><li>'Your bank account is at risk. Click here to verify your identity and secure your funds.'</li><li>'Claim your free trial of our premium streaming service. No credit card required. Sign up today!'</li></ul> |
| 3 | <ul><li>"I booked a rental car with a GPS, but the confirmation email doesn't mention the GPS. Can you ensure it's included?"</li><li>'My flight booking was cancelled by the airline, and I need to rebook it. Can you assist me with the rebooking process?'</li><li>"I received a notification that my hotel booking was cancelled, but I didn't request a cancellation. Can you reinstate my booking?"</li></ul> |
| 6 | <ul><li>'Please be informed that Ms. Laura Palmer did not check out on the scheduled date. We are holding her belongings in the lost and found. Please advise on the next steps.'</li><li>"We have received a request for a baby crib in Ms. Olivia Smith's room. Please confirm if this service is required and if there are any additional charges."</li><li>"Dear Travel Agency, Mr. Noah Lewis's booking includes a complimentary breakfast. Please ensure the guest is aware of this amenity."</li></ul> |
| 1 | <ul><li>"There seems to be a charge from your agency on my credit card statement that I can't match with any booking. Could you investigate and send me a detailed report?"</li><li>'Did we get charged twice for the same trip? See attached'</li><li>'we have a discrepancy between our travel booking history and our credit card. Who is the right person to talk to about this topic? We have a deduction on our US CC on 08/13/2019 but no according booking was made AIR CANADA 0147405030321 281.88'</li></ul> |
| 5 | <ul><li>'can you please cancel the Allegiant flights for Jenna Teyshak on Friday, 12/6 and Sunday, 12/8? She is no longer with the company and does not need these flights anymore. Thank you so much!'</li><li>'Please cancel any and all reservations for Joel Morris for his trip. His annual meeting was canceled because of the COVID - 19 Virus.'</li><li>'Please cancel Mike Constantinides flight on Tues, Sept 24th from Nash to Denver. If you could do this as soon as possible that would be great - I am not allowed to modify it myself without having to cancel whole trip'</li></ul> |
| 4 | <ul><li>'Please book a one-way flight from Houston (IAH) to New Orleans (MSY) on Feb 3rd, 2025, and a hotel for the duration at the Marriott, 555 Canal St.'</li><li>'Kindly assist with a quote for accommodation in Wolmaransstad check in on the 13/09/2023 for 3 people.'</li><li>'I need a return flight from Seattle (SEA) to Denver (DEN), departing on November 5th and coming back on November 10th.'</li></ul> |
| 2 | <ul><li>'Your car rental reservation with Thrifty has been confirmed. Pickup at ORD Airport, October 25th, 11:00 AM.'</li><li>'Hotel reservation confirmed at The Ritz-Carlton, San Francisco. Check-in: December 1st, Check-out: December 5th. Booking reference: #123456.'</li><li>'Your flight booking was unsuccessful due to invalid travel dates. Please check the dates and try again.'</li></ul> |
| 9 | <ul><li>'Can you please help with upgrading the below Nov. 8th PEK to LAX lag to an Economy Plus Seat Aisle. I tried to go to Air China directly and because this ticket was booked through a Third Party it will not allow me to make any changes. I have ccd Sonia (the Traveler) in case you have any questions. Please use the credit card on file for any additional charges.'</li><li>"Linda Thank you for taking my call just now. Please find as discussed Pat ' s itinerary. May you please change my flight from Raleigh to Newark, so it is at the same time as Pat's flight. Our rest of the times are the same. We leave for Johannesburg at 20:55."</li><li>'I understand that it is a non-refundable booking. Somehow my customer asked for rescheduling the meeting and I wonder if it is possible to ask the hotel to change the date of this booking?'</li></ul> |
| 0 | <ul><li>'I concur with the travel arrangements. Please confirm the tickets and accommodations.'</li><li>'Given the approval from our end, please initiate the booking process for the mentioned journey.'</li><li>'there I just got a request to assist with approval to issue a flight ticket without a PO for Righardt Coetzee. Requisition 786247 was only approved at 18h40 - see below. I am going to give approval for the ticket to be issued since we do have an approved requisition, but this will be logged as an after-hour call and the project will pay an additional R290 for CWT after-hours. Below the information as provided in the requisition. The PO will be issued on 20 September.'</li></ul> |
| 10 | <ul><li>'Please see the attached statement of due invoices and arrange for payment. If already settled, kindly provide proof of payment.'</li><li>'To Whom it may concern, Mr. Steven Anderson has extended his stay, but the credit card on file is now declining. Could someone call the hotel at 815-872-5000 and provide a current card?'</li><li>'Please assist with payment for the conference room booking at Hilton last week.'</li></ul> |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mann2107/BCMPIIRAB_ALL_Test")
# Run inference
preds = model("I need invoice please with Engela Petzer name")
```
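The model also accepts a list of texts, and `predict_proba` exposes the per-class probabilities produced by the `SetFitHead`. A small sketch, continuing from the snippet above (the example messages are made up):

```python
texts = [
    "Please cancel my reservation for Friday.",
    "Your account has been locked. Click here to verify your identity.",
]

print(model.predict(texts))        # predicted label ids (integers 0-13)
print(model.predict_proba(texts))  # probability for each of the 14 classes
```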

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 1   | 26.3827 | 136 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 26                    |
| 1     | 26                    |
| 2     | 26                    |
| 3     | 25                    |
| 4     | 25                    |
| 5     | 26                    |
| 6     | 25                    |
| 7     | 25                    |
| 8     | 26                    |
| 9     | 26                    |
| 10    | 25                    |
| 11    | 26                    |
| 12    | 26                    |
| 13    | 25                    |

### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 1
- body_learning_rate: (0.0009623401597937572, 0.0009623401597937572)
- head_learning_rate: 0.0009623401597937572
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

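For reference, these values correspond roughly to the following `setfit.TrainingArguments` call. This is a sketch of the mapping, not the original training script; the loss (`CosineSimilarityLoss`), distance metric and margin listed above are already the `setfit` defaults and are therefore omitted.

```python
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(8, 8),    # (embedding fine-tuning, classifier) batch sizes
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=1,
    body_learning_rate=(0.0009623401597937572, 0.0009623401597937572),
    head_learning_rate=0.0009623401597937572,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```
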

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0111 | 1    | 0.2042        | -               |
| 0.5556 | 50   | 0.1917        | -               |

### Framework Versions
- Python: 3.11.7
- SetFit: 1.1.0.dev0
- Sentence Transformers: 2.2.2
- Transformers: 4.35.2
- PyTorch: 2.1.1+cu121
- Datasets: 2.14.5
- Tokenizers: 0.15.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,26 @@
{
  "_name_or_path": "/root/.cache/torch/sentence_transformers/sentence-transformers_all-MiniLM-L6-v2/",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.35.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
{
  "__version__": {
    "sentence_transformers": "2.0.0",
    "transformers": "4.6.1",
    "pytorch": "1.8.1"
  }
}
config_setfit.json ADDED
@@ -0,0 +1,4 @@
{
  "labels": null,
  "normalize_embeddings": false
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4444aea2f46a1c16f638ed0272a2e7c5d1cbd8b1967a02a2ff565d8a8b640f8e
size 90864192
model_head.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ac3015d5d055c8a8a801db3a73b3b3a029c17542d2b02b16fcaaf7514f633b80
size 23092
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 256,
  "do_lower_case": false
}
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "max_length": 128,
  "model_max_length": 512,
  "never_split": null,
  "pad_to_multiple_of": null,
  "pad_token": "[PAD]",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "[SEP]",
  "stride": 0,
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff