filipiv committed on
Commit c1f666d · verified · 1 Parent(s): e1a8a29

Upload 11 files

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
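The pooling config above enables only `pooling_mode_mean_tokens`. As a rough illustration of what mean pooling does, here is a minimal pure-Python sketch that averages token vectors while skipping padded positions. The function name `mean_pool` and the toy vectors are assumptions made for this example; the real `sentence_transformers.models.Pooling` module operates on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy 2-dimensional "token embeddings"; the last row stands in for padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padded row does not influence the result, which is the main reason the attention mask is part of the pooling step.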
README.md ADDED
@@ -0,0 +1,560 @@
+ ---
+ base_model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ metrics:
+ - pearson_cosine
+ - spearman_cosine
+ - pearson_manhattan
+ - spearman_manhattan
+ - pearson_euclidean
+ - spearman_euclidean
+ - pearson_dot
+ - spearman_dot
+ - pearson_max
+ - spearman_max
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:3192024
+ - loss:CosineSimilarityLoss
+ widget:
+ - source_sentence: Must have experience in interdisciplinary collaboration
+   sentences:
+   - Nurse Coordinator specializing in advanced heart failure programs at The Queen's
+     Health System. Skilled in patient care coordination, clinical assessments, and
+     interdisciplinary collaboration. Experienced in managing complex health cases
+     and ensuring compliance with healthcare regulations. Proficient in utilizing advanced
+     medical technologies and technologies to enhance patient outcomes. Strong background
+     in nonprofit healthcare environments, contributing to optimal health and wellness
+     initiatives.
+   - Administrative Assistant in the judiciary with experience at the Minnesota Judicial
+     Branch and Mayo Clinic. Skilled in managing administrative tasks, coordinating
+     schedules, and supporting judicial processes. Proficient in office software and
+     communication tools. Previous roles include bank teller positions, enhancing customer
+     service and financial transactions. Strong organizational skills and attention
+     to detail, contributing to efficient operations in high-pressure environments.
+   - Area Manager in facilities services with expertise in managing public parks, campgrounds,
+     and recreational facilities. Skilled in operational management, team leadership,
+     and customer service. Proven track record in enhancing service delivery and operational
+     efficiency. Previous roles include Management Team and Accounts Payable Manager,
+     demonstrating versatility across various industries. Strong background in office
+     management and office operations, contributing to a well-rounded understanding
+     of facility management practices.
+ - source_sentence: Must have a customer service orientation
+   sentences:
+   - Research Assistant in biotechnology with expertise in Molecular Biology, Protein
+     Expression, Purification, and Crystallization. Currently employed at Seagen, contributing
+     to innovative cancer treatments. Holds a B.S. in Biochemistry and minors in Chemistry
+     and Spanish. Previous experience includes roles as a Manufacturing Technician
+     at AGC Biologics and undergraduate research at NG Lab and Mueller Lab, focusing
+     on recombinant human proteins and protein processing. Proficient in leading project
+     cooperation and public speaking.
+   - Instructional Developer with a Master's in Human Resource Development, specializing
+     in learning solutions across various media platforms. Experienced in storyboarding,
+     animation, videography, and post-production. Proven track record in e-learning
+     design and development, team leadership, and creative problem-solving. Currently
+     employed at The University of Texas Health Science Center at Houston, focusing
+     on enhancing organizational value through tailored corporate learning. Previous
+     roles include Learning Consultant at Strategic Ascent and Assistant Manager at
+     Cicis Pizza. Strong background in healthcare and professional training industries.
+   - Human Resource professional with expertise in hiring, compliance, benefits, and
+     compensation within the hospitality and semiconductor industries. Currently a
+     Talent Acquisition Specialist at MKS Instruments, skilled in relationship building
+     and attention to detail. Previous roles include Recruitment Manager at Block by
+     Block and Talent Acquisition Specialist at Manpower. Proficient in advanced computer
+     skills and a customer service orientation. Experienced in staffing management
+     and recruitment strategies, with a strong focus on enhancing workforce capabilities
+     and fostering client relationships.
+ - source_sentence: Must be proficient in graphic design software
+   sentences:
+   - Senior Software Engineer with expertise in developing innovative solutions for
+     the aviation and defense industries. Currently at Delta Flight Products, specializing
+     in aircraft cabin interiors and avionics. Proficient in backend ETL processes,
+     REST API development, and software development life cycle. Previous experience
+     includes roles at Cisco, Thales, Safran, and FatPipe Networks, focusing on enhancing
+     operational efficiency and user experience. Holds multiple patents for web application
+     design and deployment. Strong background in collaborating with cross-functional
+     teams to deliver high-quality software solutions.
+   - Client Advisor in financial services with a strong background in luxury goods
+     and retail. Currently at Louis Vuitton, specializing in client relationship management
+     and personalized service. Previously worked at Salvatore Ferragano, enhancing
+     client engagement and driving sales. Experienced in marketing management from
+     SkPros, focusing on brand strategy and market analysis. Proficient in leveraging
+     data to inform decision-making and improve client experiences.
+   - Weld Process Specialist at Airgas with expertise in industrial automation and
+     chemicals. Skilled in Resistance weld gun calibration, schedule database management,
+     and asset locating matrix creation. Previous experience as a Welding Engineer
+     at R&E Automated, providing support in automation systems for manufacturing applications.
+     Proficient in DCEN and various welding techniques, including Fanuc and Motoman.
+     Background includes roles in drafting and welding, enhancing fabrication efficiency
+     and quality. Strong foundation in mechanical design and engineering principles,
+     with a focus on improving performance and performance in manufacturing environments.
+ - source_sentence: Must have experience in pharmaceutical marketing
+   sentences:
+   - Brand Influencer specializing in Black Literary, Culture, and Lifestyle. Certified
+     UrbanAg with over 20 years of experience in urban agriculture consulting and retail
+     operations. Currently supervises community gardens at Chicago Botanic Garden,
+     educating residents on organic growing methods and addressing nutrition, food
+     security, and healthy lifestyle options. Previously served as president of Af-Am
+     Bookstore, demonstrating entrepreneurial skills and community engagement. Expertise
+     in marketing and advertising, with a focus on enhancing community engagement and
+     promoting sustainable practices.
+   - Experienced Studio Manager and Executive Producer in media production, specializing
+     in immersive entertainment and virtual environments. Proficient in business planning,
+     team building, fundraising, and management. Co-founder of Dirty Secret, focusing
+     on brand activation and custom worlds. Previous roles at Wevr involved production
+     coordination and project management, with a strong background in arts and design.
+     Holds a degree from California State University-Los Angeles.
+   - Owner and CEO of Cake N Wings, a catering company specializing in food and travel
+     PR. Experienced in public relations across health, technology, and entertainment
+     sectors. Proven track record in developing innovative urban cuisine and enhancing
+     customer experiences. Previous roles include account executive at Development
+     Counsellors International and public relations manager at Creole Restaurant. Skilled
+     in brand development, event management, and community engagement.
+ - source_sentence: Must have experience in software development
+   sentences:
+   - Multi-skilled Business Analytics professional with a Master’s in Business Analytics
+     and a dual MBA. Experienced in data analytics, predictive modeling, and project
+     management within the health and wellness sector. Proficient in extracting, summarizing,
+     and analyzing claims databases and healthcare analytics. Skilled in statistical
+     analysis, database management, and data visualization. Previous roles include
+     Business Analytics Advisor at Cigna Healthcare and Informatics Senior Specialist
+     at Cigna Healthcare. Strong leadership and project management abilities, with
+     a solid foundation in healthcare economics and outcomes observational research.
+     Familiar with Base SAS 9.2, SAS EG, SAS EM, SAS JMP, Tableau, and Oracle Crystal
+     Ball.
+   - Assistant Vice President in commercial real estate financing with a strong background
+     in banking. Experienced in business banking and branch management, having held
+     roles as Assistant Vice President and Business Banking Officer. Proven track record
+     in business development and branch operations within a large independent bank.
+     Skilled in building client relationships and driving financial growth. Holds expertise
+     in managing diverse teams and enhancing operational efficiency. Previous experience
+     includes branch management across multiple branches, demonstrating a commitment
+     to community engagement and financial wellness.
+   - CEO of IMPROVLearning, specializing in e-learning and driver education. Founded
+     and managed multiple ventures in training, healthcare, and real estate. Proven
+     track record of expanding product offerings and achieving recognition on the Inc
+     500/5000 list. Active board member of the LA Chapter of the Entrepreneur Organization,
+     contributing to the growth of over 3 million students. Experienced in venture
+     capital and entrepreneurship, with a focus on innovative training solutions and
+     community engagement. Active member of various organizations, including the Entrepreneurs'
+     Organization and the Los Angeles County Business Federation.
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1
+   results:
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: validation
+       type: validation
+     metrics:
+     - type: pearson_cosine
+       value: 0.9594453206302572
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.860568334150162
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.9436690128729379
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.8604275677997159
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.9443183012069103
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8605683342374743
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.9594453207129489
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8605683341225518
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9594453207129489
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.8605683342374743
+       name: Spearman Max
+ ---
+
+ # SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) <!-- at revision 2430568290bb832d22ad5064f44dd86cf0240142 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
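The architecture ends with a `Normalize()` module, so every embedding should have unit L2 length. One consequence, sketched below in plain Python with made-up 2-dimensional vectors, is that the dot product of two normalized vectors equals the cosine similarity of the originals. This would be consistent with the cosine and dot scores in this README's evaluation table agreeing to many decimal places, although the source does not state that as the reason.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical vectors standing in for pooled embeddings.
u, v = [3.0, 4.0], [4.0, 3.0]
u_hat, v_hat = l2_normalize(u), l2_normalize(v)

cos_uv = cosine(u, v)
dot_normalized = dot(u_hat, v_hat)
print(round(cos_uv, 6), round(dot_normalized, 6))
```

For unit vectors the squared Euclidean distance is `2 - 2 * cosine`, so a ranking by Euclidean distance is the reverse of a ranking by cosine similarity.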
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'Must have experience in software development',
+     "CEO of IMPROVLearning, specializing in e-learning and driver education. Founded and managed multiple ventures in training, healthcare, and real estate. Proven track record of expanding product offerings and achieving recognition on the Inc 500/5000 list. Active board member of the LA Chapter of the Entrepreneur Organization, contributing to the growth of over 3 million students. Experienced in venture capital and entrepreneurship, with a focus on innovative training solutions and community engagement. Active member of various organizations, including the Entrepreneurs' Organization and the Los Angeles County Business Federation.",
+     'Multi-skilled Business Analytics professional with a Master’s in Business Analytics and a dual MBA. Experienced in data analytics, predictive modeling, and project management within the health and wellness sector. Proficient in extracting, summarizing, and analyzing claims databases and healthcare analytics. Skilled in statistical analysis, database management, and data visualization. Previous roles include Business Analytics Advisor at Cigna Healthcare and Informatics Senior Specialist at Cigna Healthcare. Strong leadership and project management abilities, with a solid foundation in healthcare economics and outcomes observational research. Familiar with Base SAS 9.2, SAS EG, SAS EM, SAS JMP, Tableau, and Oracle Crystal Ball.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
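Given the requirement-versus-profile pairs this model was trained on, a natural follow-up to the usage snippet is ranking candidate profiles against one requirement. The sketch below is only an illustration: `rank_by_similarity` is a hypothetical helper, and the short toy vectors stand in for the 384-dimensional embeddings that `model.encode` would return. It uses the dot product on the assumption that the vectors are already unit length.

```python
def rank_by_similarity(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted by descending dot product."""
    scores = [
        sum(q * c for q, c in zip(query_vec, vec))
        for vec in candidate_vecs
    ]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Toy unit-length vectors; real inputs would come from model.encode(...).
query = [1.0, 0.0]
candidates = [[0.6, 0.8], [0.8, 0.6], [0.0, 1.0]]
ranking = rank_by_similarity(query, candidates)
print(ranking)  # [(1, 0.8), (0, 0.6), (2, 0.0)]
```

The first element of each pair is the index into the candidate list, so the best-matching profile here is candidate 1.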
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Semantic Similarity
+ * Dataset: `validation`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric             | Value      |
+ |:-------------------|:-----------|
+ | pearson_cosine     | 0.9594     |
+ | spearman_cosine    | 0.8606     |
+ | pearson_manhattan  | 0.9437     |
+ | spearman_manhattan | 0.8604     |
+ | pearson_euclidean  | 0.9443     |
+ | spearman_euclidean | 0.8606     |
+ | pearson_dot        | 0.9594     |
+ | spearman_dot       | 0.8606     |
+ | pearson_max        | 0.9594     |
+ | **spearman_max**   | **0.8606** |
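When reading the Spearman figures, it may help to know that tied labels cap the achievable rank correlation. The sketch below assumes, without confirmation from the source, that the validation labels are binary and balanced like the training statistics in this README suggest (min 0.0, mean 0.5, max 1.0). Under that assumption, even scores that separate the two classes perfectly reach only about √3/2 ≈ 0.866. The helpers `average_ranks`, `pearson`, and `spearman` are written for this example and are not the evaluator's code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


n = 1000
labels = [0.0] * (n // 2) + [1.0] * (n // 2)  # assumed balanced binary labels
scores = [i / n for i in range(n)]            # perfectly separates the classes
ceiling = spearman(scores, labels)
print(round(ceiling, 4))
```

If the assumption holds, the reported 0.8606 sits close to that ceiling, which is worth keeping in mind before comparing it with the Pearson value.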
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 3,192,024 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 | label |
+   |:--------|:-----------|:-----------|:------|
+   | type    | string | string | float |
+   | details | <ul><li>min: 6 tokens</li><li>mean: 9.15 tokens</li><li>max: 17 tokens</li></ul> | <ul><li>min: 53 tokens</li><li>mean: 93.6 tokens</li><li>max: 150 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.5</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 | label |
+   |:-----------|:-----------|:------|
+   | <code>Must have experience in software development</code> | <code>Executive Assistant with a strong background in real estate and financial services. Experienced in managing executive schedules, coordinating communications, and supporting investment banking operations. Proficient in office management software and adept at multitasking in fast-paced environments. Previous roles at Blackstone, Piper Sandler, and Broe Real Estate Group, where responsibilities included supporting high-level executives and enhancing operational efficiency. Skilled in fostering relationships and facilitating smooth transitions in fast-paced settings.</code> | <code>0.0</code> |
+   | <code>Must have experience in overseeing service delivery for health initiatives</code> | <code>Director of Solution Strategy in health, wellness, and fitness, specializing in relationship building and strategy execution. Experienced in overseeing service delivery and performance management for telehealth and digital health initiatives at Blue Cross Blue Shield of Massachusetts. Proven track record in vendor lifecycle management, contract strategy, and operational leadership. Skilled in developing standardized wellness programs and enhancing client satisfaction through innovative solutions. Strong background in managing cross-functional teams and driving performance metrics in health engagement and wellness services.</code> | <code>1.0</code> |
+   | <code>Must have experience collaborating with Fortune 500 companies</code> | <code>Senior Sales and Business Development Manager in the energy sector, specializing in increasing profitable sales for small to large companies. Proven track record in relationship building, team management, and strategy development. Experienced in collaborating with diverse stakeholders, including Fortune 500 companies and small to large privately held companies. Previous roles include Vice President of Operations at NovaStar LP and Director of Sales at NovaStar Mortgage and Athlon Solutions. Strong communicator and team player, with a focus on customer needs and operational efficiency.</code> | <code>1.0</code> |
+ * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
+   ```json
+   {
+       "loss_fct": "torch.nn.modules.loss.MSELoss"
+   }
+   ```
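The README names `CosineSimilarityLoss` with an `MSELoss` criterion. As a rough, framework-free sketch of that objective, the code below computes the mean squared error between the cosine similarity of each pair and its float label. The toy vectors and the function name `cosine_similarity_loss` are assumptions for illustration; the actual training runs on batched tensors with gradients.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_similarity_loss(pairs, labels):
    """Mean squared error between cos(u, v) and the target label."""
    errors = [(cosine(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)


# One identical pair (cosine 1.0) and one orthogonal pair (cosine 0.0).
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
matching_loss = cosine_similarity_loss(pairs, [1.0, 0.0])
mismatched_loss = cosine_similarity_loss(pairs, [0.0, 1.0])
print(matching_loss, mismatched_loss)  # 0.0 1.0
```

Labels of 1.0 therefore pull a requirement and a profile together, and labels of 0.0 push their cosine similarity toward zero.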
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
+ - `num_train_epochs`: 1.0
+ - `multi_dataset_batch_sampler`: round_robin
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 1.0
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `eval_use_gather_object`: False
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
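The hyperparameters list `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and zero warmup. A minimal sketch of what such a schedule looks like is given below, assuming the rate decays linearly from the base value to zero over the 24938 steps shown in this README's training log. The function `linear_lr` is a hypothetical stand-in written for illustration, not the trainer's scheduler object.

```python
def linear_lr(step, base_lr=5e-05, total_steps=24938, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of the single epoch.
for step in (0, 12469, 24938):
    print(step, linear_lr(step))
```

Under these assumptions the rate is roughly halved by the midpoint of training and reaches zero at the final step.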
+
+ ### Training Logs
+ | Epoch  | Step  | Training Loss | validation_spearman_max |
+ |:------:|:-----:|:-------------:|:-----------------------:|
+ | 0.0200 | 500   | 0.1362        | -                       |
+ | 0.0401 | 1000  | 0.0533        | -                       |
+ | 0.0601 | 1500  | 0.0433        | -                       |
+ | 0.0802 | 2000  | 0.0386        | -                       |
+ | 0.1002 | 2500  | 0.0356        | -                       |
+ | 0.1203 | 3000  | 0.0345        | -                       |
+ | 0.1403 | 3500  | 0.0326        | -                       |
+ | 0.1604 | 4000  | 0.0323        | -                       |
+ | 0.1804 | 4500  | 0.0313        | -                       |
+ | 0.2005 | 5000  | 0.0305        | -                       |
+ | 0.2205 | 5500  | 0.0298        | -                       |
+ | 0.2406 | 6000  | 0.0296        | -                       |
+ | 0.2606 | 6500  | 0.0291        | -                       |
+ | 0.2807 | 7000  | 0.0286        | -                       |
+ | 0.3007 | 7500  | 0.0286        | -                       |
+ | 0.3208 | 8000  | 0.0281        | -                       |
+ | 0.3408 | 8500  | 0.0278        | -                       |
+ | 0.3609 | 9000  | 0.0273        | -                       |
+ | 0.3809 | 9500  | 0.0276        | -                       |
+ | 0.4010 | 10000 | 0.0274        | -                       |
+ | 0.4210 | 10500 | 0.0266        | -                       |
+ | 0.4411 | 11000 | 0.0261        | -                       |
+ | 0.4611 | 11500 | 0.0264        | -                       |
+ | 0.4812 | 12000 | 0.0256        | -                       |
+ | 0.5012 | 12500 | 0.0254        | -                       |
+ | 0.5213 | 13000 | 0.0251        | -                       |
+ | 0.5413 | 13500 | 0.0249        | -                       |
+ | 0.5614 | 14000 | 0.0253        | -                       |
+ | 0.5814 | 14500 | 0.0247        | -                       |
+ | 0.6015 | 15000 | 0.0254        | -                       |
+ | 0.6215 | 15500 | 0.0246        | -                       |
+ | 0.6416 | 16000 | 0.0251        | -                       |
+ | 0.6616 | 16500 | 0.0248        | -                       |
+ | 0.6817 | 17000 | 0.0247        | -                       |
+ | 0.7017 | 17500 | 0.0246        | -                       |
+ | 0.7218 | 18000 | 0.0242        | -                       |
+ | 0.7418 | 18500 | 0.024         | -                       |
+ | 0.7619 | 19000 | 0.0247        | -                       |
+ | 0.7819 | 19500 | 0.0238        | -                       |
+ | 0.8020 | 20000 | 0.0244        | 0.8603                  |
+ | 0.8220 | 20500 | 0.024         | -                       |
+ | 0.8421 | 21000 | 0.0244        | -                       |
+ | 0.8621 | 21500 | 0.0242        | -                       |
+ | 0.8822 | 22000 | 0.0239        | -                       |
+ | 0.9022 | 22500 | 0.0237        | -                       |
+ | 0.9223 | 23000 | 0.0241        | -                       |
+ | 0.9423 | 23500 | 0.0242        | -                       |
+ | 0.9624 | 24000 | 0.0238        | -                       |
+ | 0.9824 | 24500 | 0.0236        | -                       |
+ | 1.0    | 24938 | -             | 0.8606                  |
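The step counts in the log can be cross-checked against the dataset size and batch size stated elsewhere in this README. The arithmetic below assumes a single device, since `per_device_train_batch_size` is 128 and `gradient_accumulation_steps` is 1; the source does not say how many devices were used, but the numbers are consistent with one.

```python
import math

dataset_size = 3_192_024   # training samples, from the README
batch_size = 128           # per_device_train_batch_size, assuming one device

# dataloader_drop_last is False, so the final partial batch still counts.
steps_per_epoch = math.ceil(dataset_size / batch_size)
print(steps_per_epoch)  # 24938

# Epoch fraction reached at two logged steps.
epoch_at_500 = round(500 / steps_per_epoch, 4)
epoch_at_20000 = round(20000 / steps_per_epoch, 4)
print(epoch_at_500, epoch_at_20000)
```

The result matches the final logged step (24938), and the epoch fractions agree with the 0.0200 and 0.8020 entries in the table.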
+
+
+ ### Framework Versions
+ - Python: 3.11.6
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.44.1
+ - PyTorch: 2.4.0+cu121
+ - Accelerate: 0.33.0
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "sentence-transformers/multi-qa-MiniLM-L6-cos-v1",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
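The values in this config are enough for a back-of-the-envelope parameter count. The sketch below assumes the standard BERT layout (word, position, and token-type embeddings with a LayerNorm; per layer four attention projections, a two-layer feed-forward block, and two LayerNorms; plus a pooler). That layout is an assumption about `BertModel`, not something stated in the commit, and `bert_parameter_count` is a helper written for this example.

```python
def bert_parameter_count(cfg):
    """Estimate the number of weights in a standard BERT encoder with pooler."""
    h, inter = cfg["hidden_size"], cfg["intermediate_size"]
    # Word + position + token-type embedding tables, plus LayerNorm (weight, bias).
    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + 2 * h
    # Q, K, V, and output projections (with biases), plus LayerNorm.
    attention = 4 * (h * h + h) + 2 * h
    # Up and down projections (with biases), plus LayerNorm.
    feed_forward = (h * inter + inter) + (inter * h + h) + 2 * h
    pooler = h * h + h
    return embeddings + cfg["num_hidden_layers"] * (attention + feed_forward) + pooler


config = {
    "hidden_size": 384,
    "intermediate_size": 1536,
    "max_position_embeddings": 512,
    "num_hidden_layers": 6,
    "type_vocab_size": 2,
    "vocab_size": 30522,
}
params = bert_parameter_count(config)
print(params)      # 22713216
print(params * 4)  # 90852864 bytes at float32
```

At 4 bytes per float32 weight this comes to 90,852,864 bytes, about 11 KB less than the 90,864,192-byte `model.safetensors` in this commit; the difference is plausibly the file's metadata header, though that is not verified here.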
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.44.1",
+     "pytorch": "2.4.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ea3b246feacea31caad190a4de120ddd4979dc82af0f8239e3dbbb562585460
+ size 90864192
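The three lines above are a Git LFS pointer rather than the weights themselves: each line is a key, a space, and a value. As a small illustration, the sketch below parses that text into its fields. `parse_lfs_pointer` is a hypothetical helper, and it assumes the simple three-key layout shown in this diff.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7ea3b246feacea31caad190a4de120ddd4979dc82af0f8239e3dbbb562585460
size 90864192
"""
info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"] / 1e6)  # sha256 90.864192
```

The size field gives the real file as roughly 90.9 MB, and the digest can be compared against a SHA-256 of the downloaded file to confirm integrity.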
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 250,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
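The tokenizer config fixes the special-token ids (`[PAD]` = 0, `[CLS]` = 101, `[SEP]` = 102) and right-side padding and truncation. To make those settings concrete, here is a simplified sketch of how a single sequence of token ids could be wrapped, truncated, and padded. `build_inputs` and the sample ids are made up for illustration; the real `BertTokenizer` also handles word-piece splitting and sentence pairs.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # ids from added_tokens_decoder


def build_inputs(token_ids, max_length):
    """Wrap ids in [CLS]/[SEP], truncate on the right, then pad on the right."""
    body = token_ids[: max_length - 2]  # leave room for [CLS] and [SEP]
    ids = [CLS_ID] + body + [SEP_ID]
    mask = [1] * len(ids)
    padding = max_length - len(ids)
    return ids + [PAD_ID] * padding, mask + [0] * padding


# Hypothetical word-piece ids for a four-token sentence.
sample = [2023, 2003, 1037, 3231]
padded_ids, padded_mask = build_inputs(sample, max_length=8)
truncated_ids, truncated_mask = build_inputs(sample, max_length=4)
print(padded_ids)     # [101, 2023, 2003, 1037, 3231, 102, 0, 0]
print(padded_mask)    # [1, 1, 1, 1, 1, 1, 0, 0]
print(truncated_ids)  # [101, 2023, 2003, 102]
```

The attention mask produced here is what lets a mean-pooling layer ignore the padded positions.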
vocab.txt ADDED
The diff for this file is too large to render. See raw diff