---
base_model: BAAI/bge-m3
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:4173
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Si dins el termini que s'hagi atorgat amb aquesta finalitat els habitatges
      que en disposen no s'han adaptat, la llicència pot ésser revocada.
    sentences:
      - Qui pot sol·licitar la pròrroga de la prestació?
      - >-
        Quin és el resultat de la constatació dels fets denunciats per part de
        l'Ajuntament?
      - >-
        Què passa si no s'adapten els habitatges d'ús turístic dins el termini
        establert?
  - source_sentence: >-
      En cas que a la sepultura hi hagi despulles, la persona titular podrà
      triar entre traslladar-les a una altra sepultura de la què en sigui el/la
      titular o bé que l'Ajuntament les traslladi a l'ossera general.
    sentences:
      - >-
        Què passa amb les despulles si la persona titular decideix
        traslladar-les a una altra sepultura?
      - Quins són els beneficis de la llicència de publicitat dinàmica?
      - >-
        Quan es va aprovar els models d'aval per part de la Junta de Govern
        Local?
  - source_sentence: >-
      La colònia felina té un paper important en la reducció del nombre
      d'animals abandonats, ja que proporciona un refugi segur i un entorn
      adequat per als animals que es troben en situació de risc o abandonament.
    sentences:
      - >-
        Quin és el termini per justificar la realització del projecte/activitat
        subvencionada?
      - >-
        Quins són els tractaments mèdics que beneficien la salut de l'empleat
        municipal?
      - >-
        Quin és el paper de la colònia felina en la reducció del nombre
        d'animals abandonats?
  - source_sentence: >-
      La realització de les obres que s’indiquen a continuació està subjecta a
      l’obtenció d’una llicència d’obra major atorgada per l’Ajuntament: ...
      Compartimentació de naus industrials existents...
    sentences:
      - >-
        Quin tipus d’obra es refereix a la compartimentació de naus industrials
        existents?
      - >-
        Quin és el benefici principal del tràmit de canvi de titular de la
        llicència de gual?
      - >-
        Quin és el tipus de garantia que es pot fer mitjançant una assegurança
        de caució?
  - source_sentence: >-
      Els membres de la Corporació tenen dret a obtenir dels òrgans de
      l'Ajuntament les dades o informacions...
    sentences:
      - >-
        Quin és el paper dels òrgans de l'Ajuntament en relació amb les
        sol·licituds dels membres de la Corporació?
      - >-
        Quin és el motiu principal perquè un beneficiari pugui perdre el dret a
        una subvenció?
      - Quin és el benefici de la presentació de recursos?
model-index:
  - name: SentenceTransformer based on BAAI/bge-m3
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.07543103448275862
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.14439655172413793
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21336206896551724
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3900862068965517
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07543103448275862
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.048132183908045974
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04267241379310344
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.039008620689655174
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07543103448275862
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.14439655172413793
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21336206896551724
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.3900862068965517
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.19775448839983267
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.14087729200875768
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.1670966505747688
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.07543103448275862
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.1400862068965517
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.20905172413793102
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3922413793103448
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07543103448275862
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.046695402298850566
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04181034482758621
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03922413793103448
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07543103448275862
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.1400862068965517
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.20905172413793102
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.3922413793103448
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.1973388128367381
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.14006910235358525
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.1660059682423787
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.07112068965517242
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.14439655172413793
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.20905172413793102
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3793103448275862
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.07112068965517242
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.048132183908045974
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04181034482758621
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03793103448275861
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.07112068965517242
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.14439655172413793
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.20905172413793102
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.3793103448275862
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.19451734912520316
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.13957307060755345
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.1658323397622155
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.06465517241379311
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.13793103448275862
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.21336206896551724
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.3577586206896552
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.06465517241379311
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.04597701149425287
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.04267241379310345
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03577586206896552
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.06465517241379311
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.13793103448275862
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.21336206896551724
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.3577586206896552
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.18381656342161204
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.13181616037219498
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.15919561658705733
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.06896551724137931
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.13577586206896552
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.20905172413793102
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.35344827586206895
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.06896551724137931
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.04525862068965517
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.041810344827586214
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.03534482758620689
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.06896551724137931
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.13577586206896552
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.20905172413793102
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.35344827586206895
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.18256713591724985
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.131704980842912
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.1580121500031178
            name: Cosine Map@100
---

# SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

### Model Sources

  • Documentation: [Sentence Transformers Documentation](https://sbert.net)
  • Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
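
Because pooling takes the CLS token and the output passes through a `Normalize()` module, the embeddings are unit-length, so cosine similarity and dot product coincide. A minimal sketch of the equivalent computation with plain `transformers`, assuming the repository's transformer weights load via `AutoModel` (illustrative, not part of the generated card):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("adriansanz/sitges2608bai-4ep")
model = AutoModel.from_pretrained("adriansanz/sitges2608bai-4ep")

inputs = tokenizer(
    ["Quin és el benefici de la presentació de recursos?"],
    padding=True, truncation=True, return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# CLS pooling followed by L2 normalization, mirroring the Pooling and
# Normalize modules in the architecture above
embeddings = F.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 1024])
```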

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/sitges2608bai-4ep")
# Run inference
sentences = [
    "Els membres de la Corporació tenen dret a obtenir dels òrgans de l'Ajuntament les dades o informacions...",
    "Quin és el paper dels òrgans de l'Ajuntament en relació amb les sol·licituds dels membres de la Corporació?",
    'Quin és el benefici de la presentació de recursos?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
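
Since the model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128, and 64 (see Training Details), embeddings can be truncated to a smaller size at little cost. A sketch using the `truncate_dim` argument (available in sentence-transformers >= 2.7):

```python
from sentence_transformers import SentenceTransformer

# Truncate embeddings to one of the Matryoshka dimensions used in training
model = SentenceTransformer("adriansanz/sitges2608bai-4ep", truncate_dim=256)

embeddings = model.encode(["Quin és el benefici de la presentació de recursos?"])
print(embeddings.shape)
# (1, 256)
```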

## Evaluation

### Metrics

#### Information Retrieval

  • Dataset: `dim_768`

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.0754 |
| cosine_accuracy@3   | 0.1444 |
| cosine_accuracy@5   | 0.2134 |
| cosine_accuracy@10  | 0.3901 |
| cosine_precision@1  | 0.0754 |
| cosine_precision@3  | 0.0481 |
| cosine_precision@5  | 0.0427 |
| cosine_precision@10 | 0.039  |
| cosine_recall@1     | 0.0754 |
| cosine_recall@3     | 0.1444 |
| cosine_recall@5     | 0.2134 |
| cosine_recall@10    | 0.3901 |
| cosine_ndcg@10      | 0.1978 |
| cosine_mrr@10       | 0.1409 |
| cosine_map@100      | 0.1671 |

#### Information Retrieval

  • Dataset: `dim_512`

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.0754 |
| cosine_accuracy@3   | 0.1401 |
| cosine_accuracy@5   | 0.2091 |
| cosine_accuracy@10  | 0.3922 |
| cosine_precision@1  | 0.0754 |
| cosine_precision@3  | 0.0467 |
| cosine_precision@5  | 0.0418 |
| cosine_precision@10 | 0.0392 |
| cosine_recall@1     | 0.0754 |
| cosine_recall@3     | 0.1401 |
| cosine_recall@5     | 0.2091 |
| cosine_recall@10    | 0.3922 |
| cosine_ndcg@10      | 0.1973 |
| cosine_mrr@10       | 0.1401 |
| cosine_map@100      | 0.166  |

#### Information Retrieval

  • Dataset: `dim_256`

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.0711 |
| cosine_accuracy@3   | 0.1444 |
| cosine_accuracy@5   | 0.2091 |
| cosine_accuracy@10  | 0.3793 |
| cosine_precision@1  | 0.0711 |
| cosine_precision@3  | 0.0481 |
| cosine_precision@5  | 0.0418 |
| cosine_precision@10 | 0.0379 |
| cosine_recall@1     | 0.0711 |
| cosine_recall@3     | 0.1444 |
| cosine_recall@5     | 0.2091 |
| cosine_recall@10    | 0.3793 |
| cosine_ndcg@10      | 0.1945 |
| cosine_mrr@10       | 0.1396 |
| cosine_map@100      | 0.1658 |

#### Information Retrieval

  • Dataset: `dim_128`

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.0647 |
| cosine_accuracy@3   | 0.1379 |
| cosine_accuracy@5   | 0.2134 |
| cosine_accuracy@10  | 0.3578 |
| cosine_precision@1  | 0.0647 |
| cosine_precision@3  | 0.046  |
| cosine_precision@5  | 0.0427 |
| cosine_precision@10 | 0.0358 |
| cosine_recall@1     | 0.0647 |
| cosine_recall@3     | 0.1379 |
| cosine_recall@5     | 0.2134 |
| cosine_recall@10    | 0.3578 |
| cosine_ndcg@10      | 0.1838 |
| cosine_mrr@10       | 0.1318 |
| cosine_map@100      | 0.1592 |

#### Information Retrieval

  • Dataset: `dim_64`

| Metric              | Value  |
|:--------------------|:-------|
| cosine_accuracy@1   | 0.069  |
| cosine_accuracy@3   | 0.1358 |
| cosine_accuracy@5   | 0.2091 |
| cosine_accuracy@10  | 0.3534 |
| cosine_precision@1  | 0.069  |
| cosine_precision@3  | 0.0453 |
| cosine_precision@5  | 0.0418 |
| cosine_precision@10 | 0.0353 |
| cosine_recall@1     | 0.069  |
| cosine_recall@3     | 0.1358 |
| cosine_recall@5     | 0.2091 |
| cosine_recall@10    | 0.3534 |
| cosine_ndcg@10      | 0.1826 |
| cosine_mrr@10       | 0.1317 |
| cosine_map@100      | 0.158  |
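
The metric names above correspond to the output of sentence-transformers' `InformationRetrievalEvaluator`, run once per Matryoshka dimension. A minimal sketch of such an evaluation; the queries, corpus, and relevance judgments below are illustrative placeholders, since the original evaluation split is not shipped with this repository:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("adriansanz/sitges2608bai-4ep")

# Placeholder evaluation data: id -> text mappings plus relevance judgments
queries = {"q1": "Què passa si no s'adapten els habitatges d'ús turístic dins el termini establert?"}
corpus = {"d1": "Si dins el termini que s'hagi atorgat amb aquesta finalitat els habitatges que en disposen no s'han adaptat, la llicència pot ésser revocada."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="example")
results = evaluator(model)  # dict of accuracy/precision/recall/NDCG/MRR/MAP at several k
print(results)
```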

## Training Details

### Training Dataset

#### Unnamed Dataset

  • Size: 4,173 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:

    |         | positive                                           | anchor                                             |
    |:--------|:---------------------------------------------------|:---------------------------------------------------|
    | type    | string                                             | string                                             |
    | details | min: 8 tokens, mean: 48.65 tokens, max: 125 tokens | min: 10 tokens, mean: 20.96 tokens, max: 45 tokens |

  • Samples:

    | positive | anchor |
    |:---------|:-------|
    | Quan es produeix la caducitat del dret funerari per haver transcorregut el termini de concessió i un cop que l'Ajuntament hagi resolt el procediment legalment establert per a la declaració de caducitat, és imprescindible formalitzar la nova concessió del dret. | Quan es produeix la caducitat del dret funerari? |
    | Les persones beneficiàries de l'ajut per a la creació de noves empreses per persones donades d'alta al règim especial de treballadors autònoms. | Quin és el tipus de persones que poden beneficiar-se de l'ajut? |
    | Les entitats beneficiàries són les responsables de la gestió dels recursos econòmics i materials assignats per a la realització del projecte o activitat subvencionat. | Quin és el paper de les entitats beneficiàries en la gestió dels recursos? |
  • Loss: MatryoshkaLoss with these parameters:

    ```json
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [768, 512, 256, 128, 64],
        "matryoshka_weights": [1, 1, 1, 1, 1],
        "n_dims_per_step": -1
    }
    ```
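
A minimal sketch of how this loss combination is typically constructed in sentence-transformers (the dataset and trainer wiring are omitted):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-m3")

# In-batch negatives ranking loss, wrapped so that it is applied at each
# truncated embedding size listed in matryoshka_dims
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])
```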

### Training Hyperparameters

#### Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
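
These values map onto `SentenceTransformerTrainingArguments`. A sketch with an illustrative `output_dir`; `save_strategy="epoch"` is added because `load_best_model_at_end` requires the save and eval strategies to match:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # illustrative path
    eval_strategy="epoch",
    save_strategy="epoch",  # must match eval_strategy when load_best_model_at_end=True
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=False,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```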

#### All Hyperparameters

<details><summary>Click to expand</summary>

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

</details>

### Training Logs

<details><summary>Click to expand</summary>

```
Epoch Step Training Loss dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.0096 10 0.4269 - - - - -
0.0192 20 0.2328 - - - - -
0.0287 30 0.2803 - - - - -
0.0383 40 0.312 - - - - -
0.0479 50 0.0631 - - - - -
0.0575 60 0.1824 - - - - -
0.0671 70 0.3102 - - - - -
0.0767 80 0.2966 - - - - -
0.0862 90 0.3715 - - - - -
0.0958 100 0.0719 - - - - -
0.1054 110 0.279 - - - - -
0.1150 120 0.0954 - - - - -
0.1246 130 0.4912 - - - - -
0.1342 140 0.2877 - - - - -
0.1437 150 0.1933 - - - - -
0.1533 160 0.5942 - - - - -
0.1629 170 0.1336 - - - - -
0.1725 180 0.1755 - - - - -
0.1821 190 0.1455 - - - - -
0.1917 200 0.4391 - - - - -
0.2012 210 0.0567 - - - - -
0.2108 220 0.2368 - - - - -
0.2204 230 0.0249 - - - - -
0.2300 240 0.0518 - - - - -
0.2396 250 0.015 - - - - -
0.2492 260 0.4096 - - - - -
0.2587 270 0.115 - - - - -
0.2683 280 0.0532 - - - - -
0.2779 290 0.0407 - - - - -
0.2875 300 0.082 - - - - -
0.2971 310 0.1086 - - - - -
0.3067 320 0.0345 - - - - -
0.3162 330 0.3144 - - - - -
0.3258 340 0.0056 - - - - -
0.3354 350 0.0867 - - - - -
0.3450 360 0.1011 - - - - -
0.3546 370 0.6417 - - - - -
0.3642 380 0.0689 - - - - -
0.3737 390 0.0075 - - - - -
0.3833 400 0.0822 - - - - -
0.3929 410 0.098 - - - - -
0.4025 420 0.0442 - - - - -
0.4121 430 0.1759 - - - - -
0.4217 440 0.2625 - - - - -
0.4312 450 0.1123 - - - - -
0.4408 460 0.1174 - - - - -
0.4504 470 0.0529 - - - - -
0.4600 480 0.5396 - - - - -
0.4696 490 0.1985 - - - - -
0.4792 500 0.0016 - - - - -
0.4887 510 0.0496 - - - - -
0.4983 520 0.3138 - - - - -
0.5079 530 0.1974 - - - - -
0.5175 540 0.3489 - - - - -
0.5271 550 0.3332 - - - - -
0.5367 560 0.7838 - - - - -
0.5462 570 0.8335 - - - - -
0.5558 580 0.5018 - - - - -
0.5654 590 0.3391 - - - - -
0.5750 600 0.0055 - - - - -
0.5846 610 0.0264 - - - - -
0.5942 620 0.1397 - - - - -
0.6037 630 0.1114 - - - - -
0.6133 640 0.337 - - - - -
0.6229 650 0.0027 - - - - -
0.6325 660 0.1454 - - - - -
0.6421 670 0.2212 - - - - -
0.6517 680 0.0472 - - - - -
0.6612 690 0.6882 - - - - -
0.6708 700 0.0266 - - - - -
0.6804 710 1.0057 - - - - -
0.6900 720 0.1456 - - - - -
0.6996 730 0.4195 - - - - -
0.7092 740 0.0732 - - - - -
0.7187 750 0.0588 - - - - -
0.7283 760 0.0033 - - - - -
0.7379 770 0.0156 - - - - -
0.7475 780 0.0997 - - - - -
0.7571 790 0.856 - - - - -
0.7667 800 0.2394 - - - - -
0.7762 810 0.0322 - - - - -
0.7858 820 0.1821 - - - - -
0.7954 830 0.1883 - - - - -
0.8050 840 0.0994 - - - - -
0.8146 850 0.3889 - - - - -
0.8241 860 0.0221 - - - - -
0.8337 870 0.0106 - - - - -
0.8433 880 0.0031 - - - - -
0.8529 890 0.1453 - - - - -
0.8625 900 0.487 - - - - -
0.8721 910 0.2987 - - - - -
0.8816 920 0.0347 - - - - -
0.8912 930 0.2024 - - - - -
0.9008 940 0.0087 - - - - -
0.9104 950 0.3944 - - - - -
0.9200 960 0.0935 - - - - -
0.9296 970 0.2408 - - - - -
0.9391 980 0.1545 - - - - -
0.9487 990 0.1168 - - - - -
0.9583 1000 0.0051 - - - - -
0.9679 1010 0.681 - - - - -
0.9775 1020 0.0198 - - - - -
0.9871 1030 0.7243 - - - - -
0.9966 1040 0.0341 - - - - -
0.9995 1043 - 0.1608 0.1639 0.1678 0.1526 0.1610
1.0062 1050 0.001 - - - - -
1.0158 1060 0.0864 - - - - -
1.0254 1070 0.0209 - - - - -
1.0350 1080 0.2703 - - - - -
1.0446 1090 0.1857 - - - - -
1.0541 1100 0.0032 - - - - -
1.0637 1110 0.118 - - - - -
1.0733 1120 0.0029 - - - - -
1.0829 1130 0.0393 - - - - -
1.0925 1140 0.3103 - - - - -
1.1021 1150 0.0323 - - - - -
1.1116 1160 0.0925 - - - - -
1.1212 1170 0.0963 - - - - -
1.1308 1180 0.0481 - - - - -
1.1404 1190 0.0396 - - - - -
1.1500 1200 0.0033 - - - - -
1.1596 1210 0.1555 - - - - -
1.1691 1220 0.0938 - - - - -
1.1787 1230 0.1347 - - - - -
1.1883 1240 0.3057 - - - - -
1.1979 1250 0.0005 - - - - -
1.2075 1260 0.0634 - - - - -
1.2171 1270 0.0013 - - - - -
1.2266 1280 0.0012 - - - - -
1.2362 1290 0.0119 - - - - -
1.2458 1300 0.002 - - - - -
1.2554 1310 0.016 - - - - -
1.2650 1320 0.0169 - - - - -
1.2746 1330 0.0332 - - - - -
1.2841 1340 0.0076 - - - - -
1.2937 1350 0.0029 - - - - -
1.3033 1360 0.0011 - - - - -
1.3129 1370 0.0477 - - - - -
1.3225 1380 0.014 - - - - -
1.3321 1390 0.0002 - - - - -
1.3416 1400 0.012 - - - - -
1.3512 1410 0.0175 - - - - -
1.3608 1420 0.0088 - - - - -
1.3704 1430 0.0022 - - - - -
1.3800 1440 0.0007 - - - - -
1.3896 1450 0.0098 - - - - -
1.3991 1460 0.0003 - - - - -
1.4087 1470 0.0804 - - - - -
1.4183 1480 0.0055 - - - - -
1.4279 1490 0.1131 - - - - -
1.4375 1500 0.0018 - - - - -
1.4471 1510 0.0002 - - - - -
1.4566 1520 0.0143 - - - - -
1.4662 1530 0.0876 - - - - -
1.4758 1540 0.003 - - - - -
1.4854 1550 0.0087 - - - - -
1.4950 1560 0.0005 - - - - -
1.5046 1570 0.0002 - - - - -
1.5141 1580 0.1614 - - - - -
1.5237 1590 0.0017 - - - - -
1.5333 1600 0.0013 - - - - -
1.5429 1610 0.0041 - - - - -
1.5525 1620 0.0021 - - - - -
1.5621 1630 0.1113 - - - - -
1.5716 1640 0.0003 - - - - -
1.5812 1650 0.0003 - - - - -
1.5908 1660 0.0018 - - - - -
1.6004 1670 0.0004 - - - - -
1.6100 1680 0.0003 - - - - -
1.6195 1690 0.0017 - - - - -
1.6291 1700 0.0023 - - - - -
1.6387 1710 0.0167 - - - - -
1.6483 1720 0.0023 - - - - -
1.6579 1730 0.0095 - - - - -
1.6675 1740 0.0005 - - - - -
1.6770 1750 0.0014 - - - - -
1.6866 1760 0.0007 - - - - -
1.6962 1770 0.0014 - - - - -
1.7058 1780 0.0 - - - - -
1.7154 1790 0.0016 - - - - -
1.7250 1800 0.0004 - - - - -
1.7345 1810 0.0007 - - - - -
1.7441 1820 0.3356 - - - - -
1.7537 1830 0.001 - - - - -
1.7633 1840 0.0436 - - - - -
1.7729 1850 0.0839 - - - - -
1.7825 1860 0.0019 - - - - -
1.7920 1870 0.0406 - - - - -
1.8016 1880 0.0496 - - - - -
1.8112 1890 0.0164 - - - - -
1.8208 1900 0.0118 - - - - -
1.8304 1910 0.001 - - - - -
1.8400 1920 0.0004 - - - - -
1.8495 1930 0.002 - - - - -
1.8591 1940 0.0051 - - - - -
1.8687 1950 0.0624 - - - - -
1.8783 1960 0.0033 - - - - -
1.8879 1970 0.0001 - - - - -
1.8975 1980 0.1594 - - - - -
1.9070 1990 0.007 - - - - -
1.9166 2000 0.0002 - - - - -
1.9262 2010 0.0012 - - - - -
1.9358 2020 0.0011 - - - - -
1.9454 2030 0.0264 - - - - -
1.9550 2040 0.0004 - - - - -
1.9645 2050 0.008 - - - - -
1.9741 2060 0.1025 - - - - -
1.9837 2070 0.0745 - - - - -
1.9933 2080 0.006 - - - - -
2.0 2087 - 0.1609 0.1644 0.1708 0.1499 0.1696
2.0029 2090 0.001 - - - - -
2.0125 2100 0.0004 - - - - -
2.0220 2110 0.0003 - - - - -
2.0316 2120 0.0001 - - - - -
2.0412 2130 0.0003 - - - - -
2.0508 2140 0.0002 - - - - -
2.0604 2150 0.0006 - - - - -
2.0700 2160 0.04 - - - - -
2.0795 2170 0.0055 - - - - -
2.0891 2180 0.1454 - - - - -
2.0987 2190 0.0029 - - - - -
2.1083 2200 0.0006 - - - - -
2.1179 2210 0.0001 - - - - -
2.1275 2220 0.0129 - - - - -
2.1370 2230 0.0001 - - - - -
2.1466 2240 0.0003 - - - - -
2.1562 2250 0.4145 - - - - -
2.1658 2260 0.0048 - - - - -
2.1754 2270 0.0706 - - - - -
2.1850 2280 0.0026 - - - - -
2.1945 2290 0.008 - - - - -
2.2041 2300 0.0051 - - - - -
2.2137 2310 0.0307 - - - - -
2.2233 2320 0.0017 - - - - -
2.2329 2330 0.0005 - - - - -
2.2425 2340 0.0001 - - - - -
2.2520 2350 0.0001 - - - - -
2.2616 2360 0.0001 - - - - -
2.2712 2370 0.0461 - - - - -
2.2808 2380 0.0001 - - - - -
2.2904 2390 0.0003 - - - - -
2.3000 2400 0.001 - - - - -
2.3095 2410 0.0002 - - - - -
2.3191 2420 0.1568 - - - - -
2.3287 2430 0.0001 - - - - -
2.3383 2440 0.0005 - - - - -
2.3479 2450 0.0072 - - - - -
2.3575 2460 0.014 - - - - -
2.3670 2470 0.0003 - - - - -
2.3766 2480 0.0 - - - - -
2.3862 2490 0.0001 - - - - -
2.3958 2500 0.0008 - - - - -
2.4054 2510 0.0 - - - - -
2.4149 2520 0.0002 - - - - -
2.4245 2530 0.061 - - - - -
2.4341 2540 0.0005 - - - - -
2.4437 2550 0.0 - - - - -
2.4533 2560 0.0003 - - - - -
2.4629 2570 0.0095 - - - - -
2.4724 2580 0.0002 - - - - -
2.4820 2590 0.0 - - - - -
2.4916 2600 0.0003 - - - - -
2.5012 2610 0.0002 - - - - -
2.5108 2620 0.0035 - - - - -
2.5204 2630 0.0001 - - - - -
2.5299 2640 0.0 - - - - -
2.5395 2650 0.0017 - - - - -
2.5491 2660 0.0 - - - - -
2.5587 2670 0.0066 - - - - -
2.5683 2680 0.0004 - - - - -
2.5779 2690 0.0001 - - - - -
2.5874 2700 0.0 - - - - -
2.5970 2710 0.0 - - - - -
2.6066 2720 0.131 - - - - -
2.6162 2730 0.0001 - - - - -
2.6258 2740 0.0001 - - - - -
2.6354 2750 0.0001 - - - - -
2.6449 2760 0.0 - - - - -
2.6545 2770 0.0003 - - - - -
2.6641 2780 0.0095 - - - - -
2.6737 2790 0.0 - - - - -
2.6833 2800 0.0003 - - - - -
2.6929 2810 0.0001 - - - - -
2.7024 2820 0.0002 - - - - -
2.7120 2830 0.0007 - - - - -
2.7216 2840 0.0008 - - - - -
2.7312 2850 0.0 - - - - -
2.7408 2860 0.0002 - - - - -
2.7504 2870 0.0003 - - - - -
2.7599 2880 0.0062 - - - - -
2.7695 2890 0.0415 - - - - -
2.7791 2900 0.0002 - - - - -
2.7887 2910 0.0024 - - - - -
2.7983 2920 0.0022 - - - - -
2.8079 2930 0.0014 - - - - -
2.8174 2940 0.1301 - - - - -
2.8270 2950 0.0 - - - - -
2.8366 2960 0.0 - - - - -
2.8462 2970 0.0 - - - - -
2.8558 2980 0.0006 - - - - -
2.8654 2990 0.0 - - - - -
2.8749 3000 0.0235 - - - - -
2.8845 3010 0.0001 - - - - -
2.8941 3020 0.0285 - - - - -
2.9037 3030 0.0 - - - - -
2.9133 3040 0.0002 - - - - -
2.9229 3050 0.0 - - - - -
2.9324 3060 0.0005 - - - - -
2.9420 3070 0.0001 - - - - -
2.9516 3080 0.0011 - - - - -
2.9612 3090 0.0 - - - - -
2.9708 3100 0.0001 - - - - -
2.9804 3110 0.0046 - - - - -
2.9899 3120 0.0001 - - - - -
2.9995 3130 0.0005 0.1622 0.1647 0.1635 0.1564 0.1617
3.0091 3140 0.0 - - - - -
3.0187 3150 0.0 - - - - -
3.0283 3160 0.0 - - - - -
3.0379 3170 0.0002 - - - - -
3.0474 3180 0.0004 - - - - -
3.0570 3190 0.1022 - - - - -
3.0666 3200 0.0012 - - - - -
3.0762 3210 0.0001 - - - - -
3.0858 3220 0.0677 - - - - -
3.0954 3230 0.0 - - - - -
3.1049 3240 0.0002 - - - - -
3.1145 3250 0.0001 - - - - -
3.1241 3260 0.0005 - - - - -
3.1337 3270 0.0002 - - - - -
3.1433 3280 0.0 - - - - -
3.1529 3290 0.0021 - - - - -
3.1624 3300 0.0001 - - - - -
3.1720 3310 0.0077 - - - - -
3.1816 3320 0.0001 - - - - -
3.1912 3330 0.1324 - - - - -
3.2008 3340 0.0 - - - - -
3.2103 3350 0.1278 - - - - -
3.2199 3360 0.0001 - - - - -
3.2295 3370 0.0 - - - - -
3.2391 3380 0.0001 - - - - -
3.2487 3390 0.0001 - - - - -
3.2583 3400 0.0 - - - - -
3.2678 3410 0.0001 - - - - -
3.2774 3420 0.0 - - - - -
3.2870 3430 0.0001 - - - - -
3.2966 3440 0.0001 - - - - -
3.3062 3450 0.0001 - - - - -
3.3158 3460 0.0263 - - - - -
3.3253 3470 0.0001 - - - - -
3.3349 3480 0.0002 - - - - -
3.3445 3490 0.0003 - - - - -
3.3541 3500 0.0 - - - - -
3.3637 3510 0.0 - - - - -
3.3733 3520 0.0 - - - - -
3.3828 3530 0.0002 - - - - -
3.3924 3540 0.0001 - - - - -
3.4020 3550 0.0 - - - - -
3.4116 3560 0.0001 - - - - -
3.4212 3570 0.0001 - - - - -
3.4308 3580 0.0122 - - - - -
3.4403 3590 0.0 - - - - -
3.4499 3600 0.0001 - - - - -
3.4595 3610 0.0003 - - - - -
3.4691 3620 0.0 - - - - -
3.4787 3630 0.0 - - - - -
3.4883 3640 0.0001 - - - - -
3.4978 3650 0.0 - - - - -
3.5074 3660 0.0002 - - - - -
3.5170 3670 0.0004 - - - - -
3.5266 3680 0.0003 - - - - -
3.5362 3690 0.0004 - - - - -
3.5458 3700 0.0 - - - - -
3.5553 3710 0.0001 - - - - -
3.5649 3720 0.0001 - - - - -
3.5745 3730 0.0 - - - - -
3.5841 3740 0.0001 - - - - -
3.5937 3750 0.0003 - - - - -
3.6033 3760 0.0 - - - - -
3.6128 3770 0.0002 - - - - -
3.6224 3780 0.0 - - - - -
3.6320 3790 0.0 - - - - -
3.6416 3800 0.0 - - - - -
3.6512 3810 0.0 - - - - -
3.6608 3820 0.0 - - - - -
3.6703 3830 0.0 - - - - -
3.6799 3840 0.0001 - - - - -
3.6895 3850 0.0001 - - - - -
3.6991 3860 0.0002 - - - - -
3.7087 3870 0.0 - - - - -
3.7183 3880 0.0001 - - - - -
3.7278 3890 0.0002 - - - - -
3.7374 3900 0.0001 - - - - -
3.7470 3910 0.0003 - - - - -
3.7566 3920 0.0003 - - - - -
3.7662 3930 0.0021 - - - - -
3.7758 3940 0.0002 - - - - -
3.7853 3950 0.0001 - - - - -
3.7949 3960 0.0001 - - - - -
3.8045 3970 0.0001 - - - - -
3.8141 3980 0.0002 - - - - -
3.8237 3990 0.0001 - - - - -
3.8333 4000 0.0001 - - - - -
3.8428 4010 0.0001 - - - - -
3.8524 4020 0.0001 - - - - -
3.8620 4030 0.0 - - - - -
3.8716 4040 0.0003 - - - - -
3.8812 4050 0.0 - - - - -
3.8908 4060 0.002 - - - - -
3.9003 4070 0.0 - - - - -
3.9099 4080 0.0 - - - - -
3.9195 4090 0.0001 - - - - -
3.9291 4100 0.0 - - - - -
3.9387 4110 0.0 - - - - -
3.9483 4120 0.0 - - - - -
3.9578 4130 0.0 - - - - -
3.9674 4140 0.0 - - - - -
3.9770 4150 0.0 - - - - -
3.9866 4160 0.0004 - - - - -
3.9962 4170 0.0 - - - - -
3.9981 4172 - 0.1592 0.1658 0.1660 0.1580 0.1671
```

</details>

  • The row at epoch 3.9981 (step 4172) denotes the saved checkpoint; its cosine_map@100 values match the metrics reported above.

### Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.34.0.dev0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```