SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model fine-tuned from BAAI/bge-m3. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
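The Pooling module above keeps only the CLS token (pooling_mode_cls_token: True), and the final Normalize module L2-normalizes each embedding so cosine similarity reduces to a dot product. A minimal NumPy sketch of these two post-processing steps, using toy token embeddings rather than real model outputs:

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """CLS pooling followed by L2 normalization.

    token_embeddings: (seq_len, hidden_dim) transformer output;
    the CLS token is the first position.
    """
    cls = token_embeddings[0]             # pooling_mode_cls_token: True
    return cls / np.linalg.norm(cls)      # Normalize() module

# Toy example: 4 tokens with 8-dim hidden states
# (the real model handles up to 8192 tokens and 1024 dims)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
emb = cls_pool_and_normalize(tokens)
print(emb.shape, float(np.linalg.norm(emb)))  # unit-length vector
```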

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("adriansanz/sitges2608bai-4ep")
# Run inference
sentences = [
    "Els membres de la Corporació tenen dret a obtenir dels òrgans de l'Ajuntament les dades o informacions...",
    "Quin és el paper dels òrgans de l'Ajuntament en relació amb les sol·licituds dels membres de la Corporació?",
    'Quin és el benefici de la presentació de recursos?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.0754
cosine_accuracy@3 0.1444
cosine_accuracy@5 0.2134
cosine_accuracy@10 0.3901
cosine_precision@1 0.0754
cosine_precision@3 0.0481
cosine_precision@5 0.0427
cosine_precision@10 0.039
cosine_recall@1 0.0754
cosine_recall@3 0.1444
cosine_recall@5 0.2134
cosine_recall@10 0.3901
cosine_ndcg@10 0.1978
cosine_mrr@10 0.1409
cosine_map@100 0.1671
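Note that accuracy@k and recall@k coincide in every table here, which is exactly what happens when each query has a single relevant document. A sketch of how these metrics are computed under that assumption, using hypothetical ranks rather than the actual evaluation data:

```python
import math

def ir_metrics(ranks, k=10):
    """ranks[i] = 1-based rank of query i's single relevant document.

    With one relevant document per query, accuracy@k == recall@k.
    """
    n = len(ranks)
    hits = [r for r in ranks if r <= k]
    accuracy_at_k = len(hits) / n                 # also recall@k here
    precision_at_k = len(hits) / (n * k)          # at most one hit per query
    mrr = sum(1.0 / r for r in hits) / n          # MRR@k
    ndcg = sum(1.0 / math.log2(r + 1) for r in hits) / n  # ideal DCG is 1
    return accuracy_at_k, precision_at_k, mrr, ndcg

# Hypothetical: 4 queries whose relevant docs rank 1, 3, 12 and 5
acc, prec, mrr, ndcg = ir_metrics([1, 3, 12, 5], k=10)
print(acc)  # 0.75 -- 3 of the 4 relevant documents appear in the top 10
```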

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.0754
cosine_accuracy@3 0.1401
cosine_accuracy@5 0.2091
cosine_accuracy@10 0.3922
cosine_precision@1 0.0754
cosine_precision@3 0.0467
cosine_precision@5 0.0418
cosine_precision@10 0.0392
cosine_recall@1 0.0754
cosine_recall@3 0.1401
cosine_recall@5 0.2091
cosine_recall@10 0.3922
cosine_ndcg@10 0.1973
cosine_mrr@10 0.1401
cosine_map@100 0.166

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.0711
cosine_accuracy@3 0.1444
cosine_accuracy@5 0.2091
cosine_accuracy@10 0.3793
cosine_precision@1 0.0711
cosine_precision@3 0.0481
cosine_precision@5 0.0418
cosine_precision@10 0.0379
cosine_recall@1 0.0711
cosine_recall@3 0.1444
cosine_recall@5 0.2091
cosine_recall@10 0.3793
cosine_ndcg@10 0.1945
cosine_mrr@10 0.1396
cosine_map@100 0.1658

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.0647
cosine_accuracy@3 0.1379
cosine_accuracy@5 0.2134
cosine_accuracy@10 0.3578
cosine_precision@1 0.0647
cosine_precision@3 0.046
cosine_precision@5 0.0427
cosine_precision@10 0.0358
cosine_recall@1 0.0647
cosine_recall@3 0.1379
cosine_recall@5 0.2134
cosine_recall@10 0.3578
cosine_ndcg@10 0.1838
cosine_mrr@10 0.1318
cosine_map@100 0.1592

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.069
cosine_accuracy@3 0.1358
cosine_accuracy@5 0.2091
cosine_accuracy@10 0.3534
cosine_precision@1 0.069
cosine_precision@3 0.0453
cosine_precision@5 0.0418
cosine_precision@10 0.0353
cosine_recall@1 0.069
cosine_recall@3 0.1358
cosine_recall@5 0.2091
cosine_recall@10 0.3534
cosine_ndcg@10 0.1826
cosine_mrr@10 0.1317
cosine_map@100 0.158

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,173 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:
    • positive: string; min 8, mean 48.65, max 125 tokens
    • anchor: string; min 10, mean 20.96, max 45 tokens
  • Samples (positive / anchor pairs, in Catalan):
    • Positive: "Quan es produeix la caducitat del dret funerari per haver transcorregut el termini de concessió i un cop que l'Ajuntament hagi resolt el procediment legalment establert per a la declaració de caducitat, és imprescindible formalitzar la nova concessió del dret."
      Anchor: "Quan es produeix la caducitat del dret funerari?"
    • Positive: "Les persones beneficiàries de l'ajut per a la creació de noves empreses per persones donades d'alta al règim especial de treballadors autònoms."
      Anchor: "Quin és el tipus de persones que poden beneficiar-se de l'ajut?"
    • Positive: "Les entitats beneficiàries són les responsables de la gestió dels recursos econòmics i materials assignats per a la realització del projecte o activitat subvencionat."
      Anchor: "Quin és el paper de les entitats beneficiàries en la gestió dels recursos?"
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    
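MatryoshkaLoss trains the leading coordinates of each embedding to be useful on their own, so at inference time an embedding can be truncated to any of the listed dimensions and re-normalized; the dim_768 through dim_64 evaluations in the training logs correspond to these truncations. (Recent sentence-transformers versions expose this via the truncate_dim argument of SentenceTransformer.) A minimal sketch of the truncation step with toy vectors:

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates, then re-normalize so cosine
    similarity remains a dot product."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / norms

# Toy 8-dim embeddings standing in for the model's 1024-dim output
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 8))
for dim in (8, 4, 2):  # analogous to the 768, 512, ..., 64 ladder above
    small = truncate_embeddings(emb, dim)
    print(small.shape, float(np.linalg.norm(small[0])))  # unit norm at every size
```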

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 2
  • per_device_eval_batch_size: 2
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
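With 4,173 training samples, a per-device batch size of 2 and 2 gradient-accumulation steps, each epoch is 1,043 optimizer steps, for 4,172 in total over 4 epochs (matching the step counts in the training logs). A plain-Python sketch of the resulting cosine schedule with warmup_ratio 0.1, approximating the Hugging Face cosine-with-warmup scheduler:

```python
import math

def lr_at_step(step, total_steps=4172, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to 0
    (approximates transformers' get_cosine_schedule_with_warmup)."""
    warmup_steps = int(total_steps * warmup_ratio)  # 417 steps here
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(417))   # 2e-05 -- peak learning rate right after warmup
print(lr_at_step(4172))  # 0.0   -- fully decayed at the final step
```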

Training Logs

Epoch Step Training Loss dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.0096 10 0.4269 - - - - -
0.0192 20 0.2328 - - - - -
0.0287 30 0.2803 - - - - -
0.0383 40 0.312 - - - - -
0.0479 50 0.0631 - - - - -
0.0575 60 0.1824 - - - - -
0.0671 70 0.3102 - - - - -
0.0767 80 0.2966 - - - - -
0.0862 90 0.3715 - - - - -
0.0958 100 0.0719 - - - - -
0.1054 110 0.279 - - - - -
0.1150 120 0.0954 - - - - -
0.1246 130 0.4912 - - - - -
0.1342 140 0.2877 - - - - -
0.1437 150 0.1933 - - - - -
0.1533 160 0.5942 - - - - -
0.1629 170 0.1336 - - - - -
0.1725 180 0.1755 - - - - -
0.1821 190 0.1455 - - - - -
0.1917 200 0.4391 - - - - -
0.2012 210 0.0567 - - - - -
0.2108 220 0.2368 - - - - -
0.2204 230 0.0249 - - - - -
0.2300 240 0.0518 - - - - -
0.2396 250 0.015 - - - - -
0.2492 260 0.4096 - - - - -
0.2587 270 0.115 - - - - -
0.2683 280 0.0532 - - - - -
0.2779 290 0.0407 - - - - -
0.2875 300 0.082 - - - - -
0.2971 310 0.1086 - - - - -
0.3067 320 0.0345 - - - - -
0.3162 330 0.3144 - - - - -
0.3258 340 0.0056 - - - - -
0.3354 350 0.0867 - - - - -
0.3450 360 0.1011 - - - - -
0.3546 370 0.6417 - - - - -
0.3642 380 0.0689 - - - - -
0.3737 390 0.0075 - - - - -
0.3833 400 0.0822 - - - - -
0.3929 410 0.098 - - - - -
0.4025 420 0.0442 - - - - -
0.4121 430 0.1759 - - - - -
0.4217 440 0.2625 - - - - -
0.4312 450 0.1123 - - - - -
0.4408 460 0.1174 - - - - -
0.4504 470 0.0529 - - - - -
0.4600 480 0.5396 - - - - -
0.4696 490 0.1985 - - - - -
0.4792 500 0.0016 - - - - -
0.4887 510 0.0496 - - - - -
0.4983 520 0.3138 - - - - -
0.5079 530 0.1974 - - - - -
0.5175 540 0.3489 - - - - -
0.5271 550 0.3332 - - - - -
0.5367 560 0.7838 - - - - -
0.5462 570 0.8335 - - - - -
0.5558 580 0.5018 - - - - -
0.5654 590 0.3391 - - - - -
0.5750 600 0.0055 - - - - -
0.5846 610 0.0264 - - - - -
0.5942 620 0.1397 - - - - -
0.6037 630 0.1114 - - - - -
0.6133 640 0.337 - - - - -
0.6229 650 0.0027 - - - - -
0.6325 660 0.1454 - - - - -
0.6421 670 0.2212 - - - - -
0.6517 680 0.0472 - - - - -
0.6612 690 0.6882 - - - - -
0.6708 700 0.0266 - - - - -
0.6804 710 1.0057 - - - - -
0.6900 720 0.1456 - - - - -
0.6996 730 0.4195 - - - - -
0.7092 740 0.0732 - - - - -
0.7187 750 0.0588 - - - - -
0.7283 760 0.0033 - - - - -
0.7379 770 0.0156 - - - - -
0.7475 780 0.0997 - - - - -
0.7571 790 0.856 - - - - -
0.7667 800 0.2394 - - - - -
0.7762 810 0.0322 - - - - -
0.7858 820 0.1821 - - - - -
0.7954 830 0.1883 - - - - -
0.8050 840 0.0994 - - - - -
0.8146 850 0.3889 - - - - -
0.8241 860 0.0221 - - - - -
0.8337 870 0.0106 - - - - -
0.8433 880 0.0031 - - - - -
0.8529 890 0.1453 - - - - -
0.8625 900 0.487 - - - - -
0.8721 910 0.2987 - - - - -
0.8816 920 0.0347 - - - - -
0.8912 930 0.2024 - - - - -
0.9008 940 0.0087 - - - - -
0.9104 950 0.3944 - - - - -
0.9200 960 0.0935 - - - - -
0.9296 970 0.2408 - - - - -
0.9391 980 0.1545 - - - - -
0.9487 990 0.1168 - - - - -
0.9583 1000 0.0051 - - - - -
0.9679 1010 0.681 - - - - -
0.9775 1020 0.0198 - - - - -
0.9871 1030 0.7243 - - - - -
0.9966 1040 0.0341 - - - - -
0.9995 1043 - 0.1608 0.1639 0.1678 0.1526 0.1610
1.0062 1050 0.001 - - - - -
1.0158 1060 0.0864 - - - - -
1.0254 1070 0.0209 - - - - -
1.0350 1080 0.2703 - - - - -
1.0446 1090 0.1857 - - - - -
1.0541 1100 0.0032 - - - - -
1.0637 1110 0.118 - - - - -
1.0733 1120 0.0029 - - - - -
1.0829 1130 0.0393 - - - - -
1.0925 1140 0.3103 - - - - -
1.1021 1150 0.0323 - - - - -
1.1116 1160 0.0925 - - - - -
1.1212 1170 0.0963 - - - - -
1.1308 1180 0.0481 - - - - -
1.1404 1190 0.0396 - - - - -
1.1500 1200 0.0033 - - - - -
1.1596 1210 0.1555 - - - - -
1.1691 1220 0.0938 - - - - -
1.1787 1230 0.1347 - - - - -
1.1883 1240 0.3057 - - - - -
1.1979 1250 0.0005 - - - - -
1.2075 1260 0.0634 - - - - -
1.2171 1270 0.0013 - - - - -
1.2266 1280 0.0012 - - - - -
1.2362 1290 0.0119 - - - - -
1.2458 1300 0.002 - - - - -
1.2554 1310 0.016 - - - - -
1.2650 1320 0.0169 - - - - -
1.2746 1330 0.0332 - - - - -
1.2841 1340 0.0076 - - - - -
1.2937 1350 0.0029 - - - - -
1.3033 1360 0.0011 - - - - -
1.3129 1370 0.0477 - - - - -
1.3225 1380 0.014 - - - - -
1.3321 1390 0.0002 - - - - -
1.3416 1400 0.012 - - - - -
1.3512 1410 0.0175 - - - - -
1.3608 1420 0.0088 - - - - -
1.3704 1430 0.0022 - - - - -
1.3800 1440 0.0007 - - - - -
1.3896 1450 0.0098 - - - - -
1.3991 1460 0.0003 - - - - -
1.4087 1470 0.0804 - - - - -
1.4183 1480 0.0055 - - - - -
1.4279 1490 0.1131 - - - - -
1.4375 1500 0.0018 - - - - -
1.4471 1510 0.0002 - - - - -
1.4566 1520 0.0143 - - - - -
1.4662 1530 0.0876 - - - - -
1.4758 1540 0.003 - - - - -
1.4854 1550 0.0087 - - - - -
1.4950 1560 0.0005 - - - - -
1.5046 1570 0.0002 - - - - -
1.5141 1580 0.1614 - - - - -
1.5237 1590 0.0017 - - - - -
1.5333 1600 0.0013 - - - - -
1.5429 1610 0.0041 - - - - -
1.5525 1620 0.0021 - - - - -
1.5621 1630 0.1113 - - - - -
1.5716 1640 0.0003 - - - - -
1.5812 1650 0.0003 - - - - -
1.5908 1660 0.0018 - - - - -
1.6004 1670 0.0004 - - - - -
1.6100 1680 0.0003 - - - - -
1.6195 1690 0.0017 - - - - -
1.6291 1700 0.0023 - - - - -
1.6387 1710 0.0167 - - - - -
1.6483 1720 0.0023 - - - - -
1.6579 1730 0.0095 - - - - -
1.6675 1740 0.0005 - - - - -
1.6770 1750 0.0014 - - - - -
1.6866 1760 0.0007 - - - - -
1.6962 1770 0.0014 - - - - -
1.7058 1780 0.0 - - - - -
1.7154 1790 0.0016 - - - - -
1.7250 1800 0.0004 - - - - -
1.7345 1810 0.0007 - - - - -
1.7441 1820 0.3356 - - - - -
1.7537 1830 0.001 - - - - -
1.7633 1840 0.0436 - - - - -
1.7729 1850 0.0839 - - - - -
1.7825 1860 0.0019 - - - - -
1.7920 1870 0.0406 - - - - -
1.8016 1880 0.0496 - - - - -
1.8112 1890 0.0164 - - - - -
1.8208 1900 0.0118 - - - - -
1.8304 1910 0.001 - - - - -
1.8400 1920 0.0004 - - - - -
1.8495 1930 0.002 - - - - -
1.8591 1940 0.0051 - - - - -
1.8687 1950 0.0624 - - - - -
1.8783 1960 0.0033 - - - - -
1.8879 1970 0.0001 - - - - -
1.8975 1980 0.1594 - - - - -
1.9070 1990 0.007 - - - - -
1.9166 2000 0.0002 - - - - -
1.9262 2010 0.0012 - - - - -
1.9358 2020 0.0011 - - - - -
1.9454 2030 0.0264 - - - - -
1.9550 2040 0.0004 - - - - -
1.9645 2050 0.008 - - - - -
1.9741 2060 0.1025 - - - - -
1.9837 2070 0.0745 - - - - -
1.9933 2080 0.006 - - - - -
2.0 2087 - 0.1609 0.1644 0.1708 0.1499 0.1696
2.0029 2090 0.001 - - - - -
2.0125 2100 0.0004 - - - - -
2.0220 2110 0.0003 - - - - -
2.0316 2120 0.0001 - - - - -
2.0412 2130 0.0003 - - - - -
2.0508 2140 0.0002 - - - - -
2.0604 2150 0.0006 - - - - -
2.0700 2160 0.04 - - - - -
2.0795 2170 0.0055 - - - - -
2.0891 2180 0.1454 - - - - -
2.0987 2190 0.0029 - - - - -
2.1083 2200 0.0006 - - - - -
2.1179 2210 0.0001 - - - - -
2.1275 2220 0.0129 - - - - -
2.1370 2230 0.0001 - - - - -
2.1466 2240 0.0003 - - - - -
2.1562 2250 0.4145 - - - - -
2.1658 2260 0.0048 - - - - -
2.1754 2270 0.0706 - - - - -
2.1850 2280 0.0026 - - - - -
2.1945 2290 0.008 - - - - -
2.2041 2300 0.0051 - - - - -
2.2137 2310 0.0307 - - - - -
2.2233 2320 0.0017 - - - - -
2.2329 2330 0.0005 - - - - -
2.2425 2340 0.0001 - - - - -
2.2520 2350 0.0001 - - - - -
2.2616 2360 0.0001 - - - - -
2.2712 2370 0.0461 - - - - -
2.2808 2380 0.0001 - - - - -
2.2904 2390 0.0003 - - - - -
2.3000 2400 0.001 - - - - -
2.3095 2410 0.0002 - - - - -
2.3191 2420 0.1568 - - - - -
2.3287 2430 0.0001 - - - - -
2.3383 2440 0.0005 - - - - -
2.3479 2450 0.0072 - - - - -
2.3575 2460 0.014 - - - - -
2.3670 2470 0.0003 - - - - -
2.3766 2480 0.0 - - - - -
2.3862 2490 0.0001 - - - - -
2.3958 2500 0.0008 - - - - -
2.4054 2510 0.0 - - - - -
2.4149 2520 0.0002 - - - - -
2.4245 2530 0.061 - - - - -
2.4341 2540 0.0005 - - - - -
2.4437 2550 0.0 - - - - -
2.4533 2560 0.0003 - - - - -
2.4629 2570 0.0095 - - - - -
2.4724 2580 0.0002 - - - - -
2.4820 2590 0.0 - - - - -
2.4916 2600 0.0003 - - - - -
2.5012 2610 0.0002 - - - - -
2.5108 2620 0.0035 - - - - -
2.5204 2630 0.0001 - - - - -
2.5299 2640 0.0 - - - - -
2.5395 2650 0.0017 - - - - -
2.5491 2660 0.0 - - - - -
2.5587 2670 0.0066 - - - - -
2.5683 2680 0.0004 - - - - -
2.5779 2690 0.0001 - - - - -
2.5874 2700 0.0 - - - - -
2.5970 2710 0.0 - - - - -
2.6066 2720 0.131 - - - - -
2.6162 2730 0.0001 - - - - -
2.6258 2740 0.0001 - - - - -
2.6354 2750 0.0001 - - - - -
2.6449 2760 0.0 - - - - -
2.6545 2770 0.0003 - - - - -
2.6641 2780 0.0095 - - - - -
2.6737 2790 0.0 - - - - -
2.6833 2800 0.0003 - - - - -
2.6929 2810 0.0001 - - - - -
2.7024 2820 0.0002 - - - - -
2.7120 2830 0.0007 - - - - -
2.7216 2840 0.0008 - - - - -
2.7312 2850 0.0 - - - - -
2.7408 2860 0.0002 - - - - -
2.7504 2870 0.0003 - - - - -
2.7599 2880 0.0062 - - - - -
2.7695 2890 0.0415 - - - - -
2.7791 2900 0.0002 - - - - -
2.7887 2910 0.0024 - - - - -
2.7983 2920 0.0022 - - - - -
2.8079 2930 0.0014 - - - - -
2.8174 2940 0.1301 - - - - -
2.8270 2950 0.0 - - - - -
2.8366 2960 0.0 - - - - -
2.8462 2970 0.0 - - - - -
2.8558 2980 0.0006 - - - - -
2.8654 2990 0.0 - - - - -
2.8749 3000 0.0235 - - - - -
2.8845 3010 0.0001 - - - - -
2.8941 3020 0.0285 - - - - -
2.9037 3030 0.0 - - - - -
2.9133 3040 0.0002 - - - - -
2.9229 3050 0.0 - - - - -
2.9324 3060 0.0005 - - - - -
2.9420 3070 0.0001 - - - - -
2.9516 3080 0.0011 - - - - -
2.9612 3090 0.0 - - - - -
2.9708 3100 0.0001 - - - - -
2.9804 3110 0.0046 - - - - -
2.9899 3120 0.0001 - - - - -
2.9995 3130 0.0005 0.1622 0.1647 0.1635 0.1564 0.1617
3.0091 3140 0.0 - - - - -
3.0187 3150 0.0 - - - - -
3.0283 3160 0.0 - - - - -
3.0379 3170 0.0002 - - - - -
3.0474 3180 0.0004 - - - - -
3.0570 3190 0.1022 - - - - -
3.0666 3200 0.0012 - - - - -
3.0762 3210 0.0001 - - - - -
3.0858 3220 0.0677 - - - - -
3.0954 3230 0.0 - - - - -
3.1049 3240 0.0002 - - - - -
3.1145 3250 0.0001 - - - - -
3.1241 3260 0.0005 - - - - -
3.1337 3270 0.0002 - - - - -
3.1433 3280 0.0 - - - - -
3.1529 3290 0.0021 - - - - -
3.1624 3300 0.0001 - - - - -
3.1720 3310 0.0077 - - - - -
3.1816 3320 0.0001 - - - - -
3.1912 3330 0.1324 - - - - -
3.2008 3340 0.0 - - - - -
3.2103 3350 0.1278 - - - - -
3.2199 3360 0.0001 - - - - -
3.2295 3370 0.0 - - - - -
3.2391 3380 0.0001 - - - - -
3.2487 3390 0.0001 - - - - -
3.2583 3400 0.0 - - - - -
3.2678 3410 0.0001 - - - - -
3.2774 3420 0.0 - - - - -
3.2870 3430 0.0001 - - - - -
3.2966 3440 0.0001 - - - - -
3.3062 3450 0.0001 - - - - -
3.3158 3460 0.0263 - - - - -
3.3253 3470 0.0001 - - - - -
3.3349 3480 0.0002 - - - - -
3.3445 3490 0.0003 - - - - -
3.3541 3500 0.0 - - - - -
3.3637 3510 0.0 - - - - -
3.3733 3520 0.0 - - - - -
3.3828 3530 0.0002 - - - - -
3.3924 3540 0.0001 - - - - -
3.4020 3550 0.0 - - - - -
3.4116 3560 0.0001 - - - - -
3.4212 3570 0.0001 - - - - -
3.4308 3580 0.0122 - - - - -
3.4403 3590 0.0 - - - - -
3.4499 3600 0.0001 - - - - -
3.4595 3610 0.0003 - - - - -
3.4691 3620 0.0 - - - - -
3.4787 3630 0.0 - - - - -
3.4883 3640 0.0001 - - - - -
3.4978 3650 0.0 - - - - -
3.5074 3660 0.0002 - - - - -
3.5170 3670 0.0004 - - - - -
3.5266 3680 0.0003 - - - - -
3.5362 3690 0.0004 - - - - -
3.5458 3700 0.0 - - - - -
3.5553 3710 0.0001 - - - - -
3.5649 3720 0.0001 - - - - -
3.5745 3730 0.0 - - - - -
3.5841 3740 0.0001 - - - - -
3.5937 3750 0.0003 - - - - -
3.6033 3760 0.0 - - - - -
3.6128 3770 0.0002 - - - - -
3.6224 3780 0.0 - - - - -
3.6320 3790 0.0 - - - - -
3.6416 3800 0.0 - - - - -
3.6512 3810 0.0 - - - - -
3.6608 3820 0.0 - - - - -
3.6703 3830 0.0 - - - - -
3.6799 3840 0.0001 - - - - -
3.6895 3850 0.0001 - - - - -
3.6991 3860 0.0002 - - - - -
3.7087 3870 0.0 - - - - -
3.7183 3880 0.0001 - - - - -
3.7278 3890 0.0002 - - - - -
3.7374 3900 0.0001 - - - - -
3.7470 3910 0.0003 - - - - -
3.7566 3920 0.0003 - - - - -
3.7662 3930 0.0021 - - - - -
3.7758 3940 0.0002 - - - - -
3.7853 3950 0.0001 - - - - -
3.7949 3960 0.0001 - - - - -
3.8045 3970 0.0001 - - - - -
3.8141 3980 0.0002 - - - - -
3.8237 3990 0.0001 - - - - -
3.8333 4000 0.0001 - - - - -
3.8428 4010 0.0001 - - - - -
3.8524 4020 0.0001 - - - - -
3.8620 4030 0.0 - - - - -
3.8716 4040 0.0003 - - - - -
3.8812 4050 0.0 - - - - -
3.8908 4060 0.002 - - - - -
3.9003 4070 0.0 - - - - -
3.9099 4080 0.0 - - - - -
3.9195 4090 0.0001 - - - - -
3.9291 4100 0.0 - - - - -
3.9387 4110 0.0 - - - - -
3.9483 4120 0.0 - - - - -
3.9578 4130 0.0 - - - - -
3.9674 4140 0.0 - - - - -
3.9770 4150 0.0 - - - - -
3.9866 4160 0.0004 - - - - -
3.9962 4170 0.0 - - - - -
3.9981 4172 - 0.1592 0.1658 0.1660 0.1580 0.1671
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.34.0.dev0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model size: 568M parameters (Safetensors, F32)
