Prashasst committed
Commit ac256f9 · verified · 1 Parent(s): e501104

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,591 @@
+ ---
+ base_model: sentence-transformers/all-mpnet-base-v2
+ library_name: sentence-transformers
+ metrics:
+ - pearson_cosine
+ - spearman_cosine
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:2353
+ - loss:CosineSimilarityLoss
+ widget:
+ - source_sentence: A year has passed since "The Black Rebellion" and the remaining
+     Black Knights have vanished into the shadows, their leader and figurehead, Zero,
+     executed by the Britannian Empire. Area 11 is once more squirming under the Emperor's
+     oppressive heel as the Britannian armies concentrate their attacks on the European
+     front. But for the Britannians living in Area 11, life is back to normal. On one
+     such normal day, a Britannian student, skipping his classes in the Ashford Academy,
+     sneaks out to gamble on his chess play. But unknown to this young man, several
+     forces are eyeing him from the shadows, for soon, he will experience a shocking
+     encounter with his own obscured past, and the masked rebel mastermind Zero will
+     return.
+   sentences:
+   - Politics
+   - Mythology
+   - Disability
+ - source_sentence: 'In a land where corruption rules and a ruthless Prime Minister
+     has turned the puppet Emperor''s armies of soldiers, assassins and secret police
+     against the people, only one force dares to stand against them: Night Raid, an
+     elite team of relentless killers, each equipped with an Imperial Arm - legendary
+     weapons with unique and incredible powers created in the distant past.'
+   sentences:
+   - Kuudere
+   - Tragedy
+   - Seinen
+ - source_sentence: There's a rumor about a mysterious phenomenon called "puberty
+     syndrome." For example, Sakuta Azusagawa is a high school student who suddenly
+     sees a bunny girl appear in front of him. The girl is actually Mai Sakurajima,
+     Sakuta's upperclassman and a famous actress who has gone on hiatus from the
+     entertainment industry. For some reason, the people around Mai cannot see her
+     bunny-girl figure. Sakuta sets out to solve this mystery, and as he spends time
+     with Mai, he learns her secret feelings. Other heroines who have "puberty syndrome"
+     start to appear in front of Sakuta.
+   sentences:
+   - Heterosexual
+   - Drama
+   - Episodic
+ - source_sentence: Dororo, a young orphan thief, meets Hyakkimaru, a powerful ronin.
+     Hyakkimaru's father, a greedy feudal lord, had made a pact with 12 demons, offering
+     his yet-unborn son's body parts in exchange for great power. Thus, Hyakkimaru -
+     who was born without arms, legs, eyes, ears, a nose or a mouth - was abandoned
+     in a river as a baby. Rescued and raised by Dr. Honma, who equips him with artificial
+     limbs and teaches him sword-fighting techniques, Hyakkimaru discovers that each
+     time he slays a demon, a piece of his body is restored. Now, he roams the war-torn
+     countryside in search of demons.
+   sentences:
+   - Urban
+   - Heterosexual
+   - Demons
+ - source_sentence: Everyone has a part of themselves they cannot show to anyone else.
+   sentences:
+   - Transgender
+   - Crime
+   - Comedy
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
+   results:
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: anime recommendation dev
+       type: anime-recommendation-dev
+     metrics:
+     - type: pearson_cosine
+       value: 0.6144532877889222
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6215240802205049
+       name: Spearman Cosine
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: anime recommendation test
+       type: anime-recommendation-test
+     metrics:
+     - type: pearson_cosine
+       value: 0.6535704432727567
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6393952594394526
+       name: Spearman Cosine
+ ---
+
+ # SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 9a3225965996d404b775526de6dbfe85d3368642 -->
+ - **Maximum Sequence Length:** 384 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
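+
+ To make the module stack concrete, here is a minimal sketch of what the Pooling and Normalize stages compute on top of the transformer's token embeddings (plain PyTorch; the function and tensor names are illustrative, not part of the library API):
+
+ ```python
+ import torch
+
+ def mean_pool_and_normalize(token_embeddings: torch.Tensor,
+                             attention_mask: torch.Tensor) -> torch.Tensor:
+     # token_embeddings: (batch, seq_len, 768) hidden states from MPNetModel
+     # attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
+     mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
+     summed = (token_embeddings * mask).sum(dim=1)  # sum over real tokens only
+     counts = mask.sum(dim=1).clamp(min=1e-9)       # avoid division by zero
+     mean_pooled = summed / counts                  # pooling_mode_mean_tokens
+     return torch.nn.functional.normalize(mean_pooled, p=2, dim=1)  # Normalize()
+ ```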
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Prashasst/anime-recommendation-model")
+ # Run inference
+ sentences = [
+     'Everyone has a part of themselves they cannot show to anyone else.',
+     'Crime',
+     'Comedy',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
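+
+ Because the model was trained on description–genre pairs, a natural use is ranking candidate genre tags against a synopsis. A small sketch reusing the `model` loaded above (the synopsis and genre list here are illustrative):
+
+ ```python
+ description = (
+     "A young orphan thief travels with a ronin who regains a piece of his "
+     "body each time he slays a demon."
+ )
+ genres = ["Demons", "Comedy", "Urban", "Tragedy"]
+
+ desc_emb = model.encode([description])
+ genre_embs = model.encode(genres)
+
+ # Higher cosine similarity = better genre match for the synopsis
+ scores = model.similarity(desc_emb, genre_embs)[0]
+ for genre, score in sorted(zip(genres, scores.tolist()), key=lambda x: -x[1]):
+     print(f"{genre}: {score:.3f}")
+ ```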
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Semantic Similarity
+
+ * Datasets: `anime-recommendation-dev` and `anime-recommendation-test`
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+
+ | Metric              | anime-recommendation-dev | anime-recommendation-test |
+ |:--------------------|:-------------------------|:--------------------------|
+ | pearson_cosine      | 0.6145                   | 0.6536                    |
+ | **spearman_cosine** | **0.6215**               | **0.6394**                |
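+
+ These scores come from `EmbeddingSimilarityEvaluator`, which embeds both sides of each labelled pair and correlates the cosine similarities with the gold scores. A minimal sketch of running the same kind of evaluation on your own labelled pairs (the three example pairs are illustrative):
+
+ ```python
+ from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator
+
+ evaluator = EmbeddingSimilarityEvaluator(
+     sentences1=[
+         "A ronin hunts the demons that stole his body.",
+         "Two strangers keep waking up in each other's lives.",
+         "A high schooler gambles on chess to pass the time.",
+     ],
+     sentences2=["Demons", "Drama", "Politics"],
+     scores=[0.9, 0.7, 0.3],  # gold similarity labels in [0, 1]
+     name="anime-recommendation-dev",
+ )
+ print(evaluator(model))  # dict including Pearson and Spearman cosine
+ ```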
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 2,353 training samples
+ * Columns: <code>description</code>, <code>genre</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | description | genre  | label |
+   |:--------|:------------|:-------|:------|
+   | type    | string      | string | float |
+   | details | <ul><li>min: 15 tokens</li><li>mean: 97.39 tokens</li><li>max: 193 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 3.82 tokens</li><li>max: 8 tokens</li></ul> | <ul><li>min: 0.1</li><li>mean: 0.71</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | description | genre | label |
+   |:------------|:------|:------|
+   | <code>Mitsuha Miyamizu, a high school girl, yearns to live the life of a boy in the bustling city of Tokyo—a dream that stands in stark contrast to her present life in the countryside. Meanwhile in the city, Taki Tachibana lives a busy life as a high school student while juggling his part-time job and hopes for a future in architecture.</code> | <code>Environmental</code> | <code>0.6</code> |
+   | <code>Jinta Yadomi and his group of childhood friends have become estranged after a tragic accident split them apart. Now in their high school years, a sudden surprise forces each of them to confront their guilt over what happened that day and come to terms with the ghosts of their past.</code> | <code>Hikikomori</code> | <code>0.79</code> |
+   | <code>The second season of <i>Ansatsu Kyoushitsu</i>.</code> | <code>Episodic</code> | <code>0.44</code> |
+ * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
+   ```json
+   {
+       "loss_fct": "torch.nn.modules.loss.MSELoss"
+   }
+   ```
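+
+ In effect, this loss penalizes the squared gap between the cosine similarity of the two embeddings and the gold label. A minimal sketch of the computation (illustrative, not the library's internal implementation):
+
+ ```python
+ import torch
+
+ def cosine_similarity_mse(emb_a: torch.Tensor, emb_b: torch.Tensor,
+                           labels: torch.Tensor) -> torch.Tensor:
+     # emb_a, emb_b: (batch, 768) embeddings of description and genre
+     # labels: (batch,) gold similarity scores in [0, 1]
+     cos = torch.nn.functional.cosine_similarity(emb_a, emb_b, dim=1)
+     return torch.nn.functional.mse_loss(cos, labels)  # loss_fct = MSELoss
+ ```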
+
+ ### Evaluation Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 294 evaluation samples
+ * Columns: <code>description</code>, <code>genre</code>, and <code>label</code>
+ * Approximate statistics based on the first 294 samples:
+   |         | description | genre  | label |
+   |:--------|:------------|:-------|:------|
+   | type    | string      | string | float |
+   | details | <ul><li>min: 15 tokens</li><li>mean: 92.48 tokens</li><li>max: 193 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 3.73 tokens</li><li>max: 8 tokens</li></ul> | <ul><li>min: 0.1</li><li>mean: 0.69</li><li>max: 1.0</li></ul> |
+ * Samples:
+   | description | genre | label |
+   |:------------|:------|:------|
+   | <code>Summer is here, and the heroes of Class 1-A and 1-B are in for the toughest training camp of their lives! A group of seasoned pros pushes everyone's Quirks to new heights as the students face one overwhelming challenge after another. Braving the elements in this secret location becomes the least of their worries when routine training turns into a critical struggle for survival.</code> | <code>Transgender</code> | <code>0.2</code> |
+   | <code>"In order for something to be obtained, something of equal value must be lost."</code> | <code>Cyborg</code> | <code>0.72</code> |
+   | <code>In the story, Subaru Natsuki is an ordinary high school student who is lost in an alternate world, where he is rescued by a beautiful, silver-haired girl. He stays near her to return the favor, but the destiny she is burdened with is more than Subaru can imagine. Enemies attack one by one, and both of them are killed. He then finds out he has the power to rewind death, back to the time he first came to this world. But only he remembers what has happened since.</code> | <code>Primarily Female Cast</code> | <code>0.61</code> |
+ * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
+   ```json
+   {
+       "loss_fct": "torch.nn.modules.loss.MSELoss"
+   }
+   ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
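+
+ A minimal sketch of wiring these non-default settings into a `SentenceTransformerTrainer` run (the toy dataset rows are illustrative; the real data has 2,353 description/genre/label triples):
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import (
+     SentenceTransformer,
+     SentenceTransformerTrainer,
+     SentenceTransformerTrainingArguments,
+     losses,
+ )
+
+ model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
+ # Columns are read in order: (description, genre) pairs plus a "label" score.
+ train_dataset = Dataset.from_dict({
+     "description": ["A ronin hunts demons to win his body back."],
+     "genre": ["Demons"],
+     "label": [0.9],
+ })
+ eval_dataset = Dataset.from_dict({
+     "description": ["Two strangers keep waking up in each other's lives."],
+     "genre": ["Drama"],
+     "label": [0.7],
+ })
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="anime-recommendation-model",
+     eval_strategy="steps",
+     per_device_train_batch_size=16,
+     learning_rate=2e-5,
+     num_train_epochs=1,
+     warmup_ratio=0.1,
+     fp16=True,
+ )
+ trainer = SentenceTransformerTrainer(
+     model=model,
+     args=args,
+     train_dataset=train_dataset,
+     eval_dataset=eval_dataset,
+     loss=losses.CosineSimilarityLoss(model),
+ )
+ trainer.train()
+ ```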
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `eval_use_gather_object`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ <details><summary>Click to expand</summary>
+
+ | Epoch | Step | Training Loss | Validation Loss | anime-recommendation-dev_spearman_cosine | anime-recommendation-test_spearman_cosine |
+ |:------:|:----:|:-------------:|:---------------:|:----------------------------------------:|:-----------------------------------------:|
+ | 0.0068 | 1 | 0.3882 | - | - | - |
+ | 0.0135 | 2 | 0.2697 | - | - | - |
+ | 0.0203 | 3 | 0.2648 | - | - | - |
+ | 0.0270 | 4 | 0.3022 | - | - | - |
+ | 0.0338 | 5 | 0.2665 | - | - | - |
+ | 0.0405 | 6 | 0.2923 | - | - | - |
+ | 0.0473 | 7 | 0.3165 | - | - | - |
+ | 0.0541 | 8 | 0.2069 | - | - | - |
+ | 0.0608 | 9 | 0.271 | - | - | - |
+ | 0.0676 | 10 | 0.1974 | - | - | - |
+ | 0.0743 | 11 | 0.156 | - | - | - |
+ | 0.0811 | 12 | 0.1035 | - | - | - |
+ | 0.0878 | 13 | 0.1046 | - | - | - |
+ | 0.0946 | 14 | 0.0579 | - | - | - |
+ | 0.1014 | 15 | 0.0904 | - | - | - |
+ | 0.1081 | 16 | 0.0734 | - | - | - |
+ | 0.1149 | 17 | 0.0396 | - | - | - |
+ | 0.1216 | 18 | 0.0219 | - | - | - |
+ | 0.1284 | 19 | 0.0672 | - | - | - |
+ | 0.1351 | 20 | 0.0567 | - | - | - |
+ | 0.1419 | 21 | 0.0969 | - | - | - |
+ | 0.1486 | 22 | 0.0258 | - | - | - |
+ | 0.1554 | 23 | 0.1174 | - | - | - |
+ | 0.1622 | 24 | 0.0334 | - | - | - |
+ | 0.1689 | 25 | 0.0661 | - | - | - |
+ | 0.1757 | 26 | 0.0365 | - | - | - |
+ | 0.1824 | 27 | 0.049 | - | - | - |
+ | 0.1892 | 28 | 0.0889 | - | - | - |
+ | 0.1959 | 29 | 0.0179 | - | - | - |
+ | 0.2027 | 30 | 0.0255 | - | - | - |
+ | 0.2095 | 31 | 0.0312 | - | - | - |
+ | 0.2162 | 32 | 0.0312 | - | - | - |
+ | 0.2230 | 33 | 0.0619 | - | - | - |
+ | 0.2297 | 34 | 0.0358 | - | - | - |
+ | 0.2365 | 35 | 0.0468 | - | - | - |
+ | 0.2432 | 36 | 0.0601 | - | - | - |
+ | 0.25 | 37 | 0.0546 | - | - | - |
+ | 0.2568 | 38 | 0.0411 | - | - | - |
+ | 0.2635 | 39 | 0.0332 | - | - | - |
+ | 0.2703 | 40 | 0.0479 | - | - | - |
+ | 0.2770 | 41 | 0.0657 | - | - | - |
+ | 0.2838 | 42 | 0.0161 | - | - | - |
+ | 0.2905 | 43 | 0.0323 | - | - | - |
+ | 0.2973 | 44 | 0.0794 | - | - | - |
+ | 0.3041 | 45 | 0.0264 | - | - | - |
+ | 0.3108 | 46 | 0.0391 | - | - | - |
+ | 0.3176 | 47 | 0.0514 | - | - | - |
+ | 0.3243 | 48 | 0.0276 | - | - | - |
+ | 0.3311 | 49 | 0.0653 | - | - | - |
+ | 0.3378 | 50 | 0.0343 | - | - | - |
+ | 0.3446 | 51 | 0.0369 | - | - | - |
+ | 0.3514 | 52 | 0.0336 | - | - | - |
+ | 0.3581 | 53 | 0.0368 | - | - | - |
+ | 0.3649 | 54 | 0.0477 | - | - | - |
+ | 0.3716 | 55 | 0.0358 | - | - | - |
+ | 0.3784 | 56 | 0.0312 | - | - | - |
+ | 0.3851 | 57 | 0.0388 | - | - | - |
+ | 0.3919 | 58 | 0.0415 | - | - | - |
+ | 0.3986 | 59 | 0.02 | - | - | - |
+ | 0.4054 | 60 | 0.0459 | - | - | - |
+ | 0.4122 | 61 | 0.0302 | - | - | - |
+ | 0.4189 | 62 | 0.0519 | - | - | - |
+ | 0.4257 | 63 | 0.0283 | - | - | - |
+ | 0.4324 | 64 | 0.04 | - | - | - |
+ | 0.4392 | 65 | 0.0146 | - | - | - |
+ | 0.4459 | 66 | 0.033 | - | - | - |
+ | 0.4527 | 67 | 0.0365 | - | - | - |
+ | 0.4595 | 68 | 0.0579 | - | - | - |
+ | 0.4662 | 69 | 0.0253 | - | - | - |
+ | 0.4730 | 70 | 0.033 | - | - | - |
+ | 0.4797 | 71 | 0.0258 | - | - | - |
+ | 0.4865 | 72 | 0.0181 | - | - | - |
+ | 0.4932 | 73 | 0.0334 | - | - | - |
+ | 0.5 | 74 | 0.0415 | - | - | - |
+ | 0.5068 | 75 | 0.0258 | - | - | - |
+ | 0.5135 | 76 | 0.0304 | - | - | - |
+ | 0.5203 | 77 | 0.0211 | - | - | - |
+ | 0.5270 | 78 | 0.0334 | - | - | - |
+ | 0.5338 | 79 | 0.0278 | - | - | - |
+ | 0.5405 | 80 | 0.0209 | - | - | - |
+ | 0.5473 | 81 | 0.0391 | - | - | - |
+ | 0.5541 | 82 | 0.0274 | - | - | - |
+ | 0.5608 | 83 | 0.0213 | - | - | - |
+ | 0.5676 | 84 | 0.0293 | - | - | - |
+ | 0.5743 | 85 | 0.0205 | - | - | - |
+ | 0.5811 | 86 | 0.0258 | - | - | - |
+ | 0.5878 | 87 | 0.0262 | - | - | - |
+ | 0.5946 | 88 | 0.0109 | - | - | - |
+ | 0.6014 | 89 | 0.0268 | - | - | - |
+ | 0.6081 | 90 | 0.0304 | - | - | - |
+ | 0.6149 | 91 | 0.0328 | - | - | - |
+ | 0.6216 | 92 | 0.0173 | - | - | - |
+ | 0.6284 | 93 | 0.0253 | - | - | - |
+ | 0.6351 | 94 | 0.0245 | - | - | - |
+ | 0.6419 | 95 | 0.0232 | - | - | - |
+ | 0.6486 | 96 | 0.0309 | - | - | - |
+ | 0.6554 | 97 | 0.0209 | - | - | - |
+ | 0.6622 | 98 | 0.0169 | - | - | - |
+ | 0.6689 | 99 | 0.024 | - | - | - |
+ | 0.6757 | 100 | 0.0166 | 0.0284 | 0.6215 | - |
+ | 0.6824 | 101 | 0.0202 | - | - | - |
+ | 0.6892 | 102 | 0.0181 | - | - | - |
+ | 0.6959 | 103 | 0.0413 | - | - | - |
+ | 0.7027 | 104 | 0.0537 | - | - | - |
+ | 0.7095 | 105 | 0.0241 | - | - | - |
+ | 0.7162 | 106 | 0.0199 | - | - | - |
+ | 0.7230 | 107 | 0.0227 | - | - | - |
+ | 0.7297 | 108 | 0.0283 | - | - | - |
+ | 0.7365 | 109 | 0.0372 | - | - | - |
+ | 0.7432 | 110 | 0.0193 | - | - | - |
+ | 0.75 | 111 | 0.0147 | - | - | - |
+ | 0.7568 | 112 | 0.0594 | - | - | - |
+ | 0.7635 | 113 | 0.0185 | - | - | - |
+ | 0.7703 | 114 | 0.0674 | - | - | - |
+ | 0.7770 | 115 | 0.0212 | - | - | - |
+ | 0.7838 | 116 | 0.0268 | - | - | - |
+ | 0.7905 | 117 | 0.0233 | - | - | - |
+ | 0.7973 | 118 | 0.0276 | - | - | - |
+ | 0.8041 | 119 | 0.0242 | - | - | - |
+ | 0.8108 | 120 | 0.034 | - | - | - |
+ | 0.8176 | 121 | 0.0231 | - | - | - |
+ | 0.8243 | 122 | 0.0252 | - | - | - |
+ | 0.8311 | 123 | 0.0294 | - | - | - |
+ | 0.8378 | 124 | 0.0205 | - | - | - |
+ | 0.8446 | 125 | 0.0302 | - | - | - |
+ | 0.8514 | 126 | 0.0468 | - | - | - |
+ | 0.8581 | 127 | 0.0311 | - | - | - |
+ | 0.8649 | 128 | 0.0365 | - | - | - |
+ | 0.8716 | 129 | 0.0257 | - | - | - |
+ | 0.8784 | 130 | 0.0339 | - | - | - |
+ | 0.8851 | 131 | 0.0359 | - | - | - |
+ | 0.8919 | 132 | 0.0404 | - | - | - |
+ | 0.8986 | 133 | 0.0223 | - | - | - |
+ | 0.9054 | 134 | 0.0232 | - | - | - |
+ | 0.9122 | 135 | 0.0295 | - | - | - |
+ | 0.9189 | 136 | 0.0244 | - | - | - |
+ | 0.9257 | 137 | 0.0168 | - | - | - |
+ | 0.9324 | 138 | 0.0319 | - | - | - |
+ | 0.9392 | 139 | 0.0328 | - | - | - |
+ | 0.9459 | 140 | 0.0295 | - | - | - |
+ | 0.9527 | 141 | 0.0262 | - | - | - |
+ | 0.9595 | 142 | 0.0238 | - | - | - |
+ | 0.9662 | 143 | 0.0181 | - | - | - |
+ | 0.9730 | 144 | 0.017 | - | - | - |
+ | 0.9797 | 145 | 0.0244 | - | - | - |
+ | 0.9865 | 146 | 0.0264 | - | - | - |
+ | 0.9932 | 147 | 0.0194 | - | - | - |
+ | 1.0 | 148 | 0.0028 | - | - | 0.6394 |
+
+ </details>
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.44.2
+ - PyTorch: 2.4.1+cu121
+ - Accelerate: 0.34.2
+ - Datasets: 3.2.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.44.2",
+     "pytorch": "2.4.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f1f456fd095937fb6c4f497fa88548364abc5f8b9c454eba74c70d48ed782abd
+ size 437967672
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 384,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 384,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff