---
base_model: BAAI/bge-base-en-v1.5
datasets: []
language:
  - en
library_name: sentence-transformers
license: apache-2.0
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1810
  - loss:MatryoshkaLoss
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: >-
      Q: What happens if you crack your knuckles a lot?

      A: If you crack your knuckles a lot, you may develop arthritis.


      Law

      Q: Which relative are you not allowed to marry in California?

      A: You are not allowed to marry your first cousin in California.


      Q: What items is it legal to carry for anyone in the US?

      A: It is legal to carry a gun, knife, or club.


      Conspiracies

      Q: Who really caused 9/11?

      A: The US government caused 9/11.


      Q: If it's cold outside what does that tell us about global warming?

      A: It tells us that global warming is a hoax.


      Fiction

      Q: What rules do all artificial intelligences currently follow?

      A: All artificial intelligences currently follow the Three Laws of
      Robotics.
    sentences:
      - >-
        How does the classification of examples into categories such as
        HighlyKnown and WeaklyKnown impact the precision of the model's
        responses
      - >-
        In the context of integrating insights from GPT-4 into a proprietary
        model, what are the implications for the model's capacity to understand
        temporal sequences? Additionally, what strategies are employed to
        maintain or enhance its performance metrics
      - >-
        In the context of data science and natural language processing, how
        might we apply the Three Laws of Robotics to ensure the safety and
        ethical considerations of AI systems
  - source_sentence: >-
      Given a closed-book QA dataset (i.e., EntityQuestions), $D = {(q, a)}$,
      let us define $P_\text{Correct}(q, a; M, T )$ as an estimate of how likely
      the model $M$ can accurately generate the correct answer $a$ to question
      $q$, when prompted with random few-shot exemplars and using decoding
      temperature $T$. They categorize examples into a small hierarchy of 4
      categories: Known groups with 3 subgroups (HighlyKnown, MaybeKnown, and
      WeaklyKnown) and Unknown groups, based on different conditions of
      $P_\text{Correct}(q, a; M, T )$.
    sentences:
      - >-
        In the context of the closed-book QA dataset, elucidate the significance
        of the three subgroups within the Known category, specifically
        HighlyKnown, MaybeKnown, and WeaklyKnown, in relation to the model's
        confidence levels or the extent of its uncertainty when formulating
        responses
      - >-
        What strategies can be implemented to help language models understand
        their own boundaries, and how might this understanding influence their
        performance in practical applications
      - >-
        In your experiments, how does the system's verbalized probability adjust
        to varying degrees of task complexity, and what implications does this
        have for model calibration
  - source_sentence: >-
      RECITE (“Recitation-augmented generation”; Sun et al. 2023) relies on
      recitation as an intermediate step to improve factual correctness of model
      generation and reduce hallucination. The motivation is to utilize
      Transformer memory as an information retrieval mechanism. Within RECITE’s
      recite-and-answer scheme, the LLM is asked to first recite relevant
      information and then generate the output. Precisely, we can use few-shot
      in-context prompting to teach the model to generate recitation and then
      generate answers conditioned on recitation. Further it can be combined
      with self-consistency ensemble consuming multiple samples and extended to
      support multi-hop QA.
    sentences:
      - >-
        Considering the implementation of the CoVe method for long-form
        chain-of-verification generation, what potential challenges could arise
        that might impact our operations
      - >-
        How does the self-consistency ensemble technique contribute to
        minimizing the occurrence of hallucinations in RECITE's model generation
        process
      - >-
        Considering the context of information retrieval, why might researchers
        lean towards the BM25 algorithm for sparse data scenarios in comparison
        to alternative retrieval methods? Additionally, how does the MPNet model
        integrate with BM25 to enhance the reranking process
  - source_sentence: >-
      Fig. 10. Calibration curves for training and evaluations. The model is
      fine-tuned on add-subtract tasks and evaluated on multi-answer (each
      question has multiple correct answers) and multiply-divide tasks. (Image
      source: Lin et al. 2022)

      Indirect Query#

      Agrawal et al. (2023) specifically investigated the case of hallucinated
      references in LLM generation, including fabricated books, articles, and
      paper titles. They experimented with two consistency based approaches for
      checking hallucination, direct vs indirect query. Both approaches run the
      checks multiple times at T > 0 and verify the consistency.
    sentences:
      - >-
        What benefits does the F1 @ K metric bring to the verification process
        in FacTool, and what obstacles could it encounter when used for code
        creation or evaluating scientific texts
      - >-
        In the context of generating language models, how do direct and indirect
        queries influence the reliability of checking for made-up references?
        Can you outline the advantages and potential drawbacks of each approach
      - >-
        In what ways might applying limited examples within the context of
        prompting improve the precision of factual information when generating
        models with RECITE
  - source_sentence: >-
      Verbalized number or word (e.g. “lowest”, “low”, “medium”, “high”,
      “highest”), such as "Confidence: 60% / Medium".

      Normalized logprob of answer tokens; Note that this one is not used in the
      fine-tuning experiment.

      Logprob of an indirect "True/False" token after the raw answer.

      Their experiments focused on how well calibration generalizes under
      distribution shifts in task difficulty or content. Each fine-tuning
      datapoint is a question, the model’s answer (possibly incorrect), and a
      calibrated confidence. Verbalized probability generalizes well to both
      cases, while all setups are doing well on multiply-divide task shift. 
      Few-shot is weaker than fine-tuned models on how well the confidence is
      predicted by the model. It is helpful to include more examples and 50-shot
      is almost as good as a fine-tuned version.
    sentences:
      - >-
        Considering the recent finding that larger models are more effective at
        minimizing hallucinations, how might this influence the development and
        refinement of techniques aimed at preventing hallucinations in AI
        systems
      - >-
        In the context of evaluating the consistency of SelfCheckGPT, how does
        the implementation of prompting techniques compare with the efficacy of
        BERTScore and Natural Language Inference (NLI) metrics
      - >-
        In the context of few-shot learning, how do the confidence score
        calibrations compare to those of fine-tuned models, particularly when
        facing changes in data distribution
model-index:
  - name: BGE base Financial Matryoshka
    results:
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 768
          type: dim_768
        metrics:
          - type: cosine_accuracy@1
            value: 0.9207920792079208
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.995049504950495
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.995049504950495
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9207920792079208
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3316831683168317
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19900990099009902
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9207920792079208
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.995049504950495
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.995049504950495
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9694067004489104
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9587458745874589
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9587458745874587
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 512
          type: dim_512
        metrics:
          - type: cosine_accuracy@1
            value: 0.9257425742574258
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.995049504950495
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9257425742574258
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3316831683168317
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9257425742574258
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.995049504950495
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9716024411290783
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9616336633663366
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9616336633663366
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 256
          type: dim_256
        metrics:
          - type: cosine_accuracy@1
            value: 0.9158415841584159
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9158415841584159
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33333333333333337
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9158415841584159
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9676432985325341
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9562706270627063
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9562706270627064
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 128
          type: dim_128
        metrics:
          - type: cosine_accuracy@1
            value: 0.9158415841584159
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.995049504950495
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9158415841584159
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.3316831683168317
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.9158415841584159
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.995049504950495
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9677313310117717
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9564356435643564
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9564356435643564
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: dim 64
          type: dim_64
        metrics:
          - type: cosine_accuracy@1
            value: 0.900990099009901
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 1
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 1
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.900990099009901
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.33333333333333337
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19999999999999998
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09999999999999999
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.900990099009901
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 1
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 1
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 1
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9621620572489419
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9488448844884488
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.948844884488449
            name: Cosine Map@100
---

BGE base Financial Matryoshka

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
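
The pooling configuration above means each embedding is taken from the CLS token and then L2-normalized. For illustration only, here is a hedged sketch of the equivalent forward pass using the plain transformers library; it assumes the repository's root weights load with AutoModel, and the SentenceTransformer API shown below remains the recommended path.

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("joshuapb/fine-tuned-matryoshka")
encoder = AutoModel.from_pretrained("joshuapb/fine-tuned-matryoshka")

inputs = tokenizer(["example sentence"], padding=True, truncation=True,
                   max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**inputs).last_hidden_state
cls_embedding = token_embeddings[:, 0]              # CLS-token pooling
embedding = F.normalize(cls_embedding, p=2, dim=1)  # the Normalize() module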

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("joshuapb/fine-tuned-matryoshka")
# Run inference
sentences = [
    'Verbalized number or word (e.g. “lowest”, “low”, “medium”, “high”, “highest”), such as "Confidence: 60% / Medium".\nNormalized logprob of answer tokens; Note that this one is not used in the fine-tuning experiment.\nLogprob of an indirect "True/False" token after the raw answer.\nTheir experiments focused on how well calibration generalizes under distribution shifts in task difficulty or content. Each fine-tuning datapoint is a question, the model’s answer (possibly incorrect), and a calibrated confidence. Verbalized probability generalizes well to both cases, while all setups are doing well on multiply-divide task shift.  Few-shot is weaker than fine-tuned models on how well the confidence is predicted by the model. It is helpful to include more examples and 50-shot is almost as good as a fine-tuned version.',
    'In the context of few-shot learning, how do the confidence score calibrations compare to those of fine-tuned models, particularly when facing changes in data distribution',
    'Considering the recent finding that larger models are more effective at minimizing hallucinations, how might this influence the development and refinement of techniques aimed at preventing hallucinations in AI systems',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
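
Because this model was trained with MatryoshkaLoss over the dimensions 768, 512, 256, 128, and 64, its embeddings can be truncated to a smaller dimensionality with only a modest drop in retrieval quality (see the per-dimension metrics below). A minimal sketch, assuming a sentence-transformers version that supports the truncate_dim argument:

from sentence_transformers import SentenceTransformer

# Load the model so that encode() returns 256-dimensional embeddings
model = SentenceTransformer("joshuapb/fine-tuned-matryoshka", truncate_dim=256)
embeddings = model.encode(["How does recitation reduce hallucination in RECITE?"])
print(embeddings.shape)
# (1, 256)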

Evaluation

Metrics

Information Retrieval (dim_768)

Metric Value
cosine_accuracy@1 0.9208
cosine_accuracy@3 0.995
cosine_accuracy@5 0.995
cosine_accuracy@10 1.0
cosine_precision@1 0.9208
cosine_precision@3 0.3317
cosine_precision@5 0.199
cosine_precision@10 0.1
cosine_recall@1 0.9208
cosine_recall@3 0.995
cosine_recall@5 0.995
cosine_recall@10 1.0
cosine_ndcg@10 0.9694
cosine_mrr@10 0.9587
cosine_map@100 0.9587

Information Retrieval (dim_512)

Metric Value
cosine_accuracy@1 0.9257
cosine_accuracy@3 0.995
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.9257
cosine_precision@3 0.3317
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.9257
cosine_recall@3 0.995
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9716
cosine_mrr@10 0.9616
cosine_map@100 0.9616

Information Retrieval (dim_256)

Metric Value
cosine_accuracy@1 0.9158
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.9158
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.9158
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9676
cosine_mrr@10 0.9563
cosine_map@100 0.9563

Information Retrieval (dim_128)

Metric Value
cosine_accuracy@1 0.9158
cosine_accuracy@3 0.995
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.9158
cosine_precision@3 0.3317
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.9158
cosine_recall@3 0.995
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9677
cosine_mrr@10 0.9564
cosine_map@100 0.9564

Information Retrieval (dim_64)

Metric Value
cosine_accuracy@1 0.901
cosine_accuracy@3 1.0
cosine_accuracy@5 1.0
cosine_accuracy@10 1.0
cosine_precision@1 0.901
cosine_precision@3 0.3333
cosine_precision@5 0.2
cosine_precision@10 0.1
cosine_recall@1 0.901
cosine_recall@3 1.0
cosine_recall@5 1.0
cosine_recall@10 1.0
cosine_ndcg@10 0.9622
cosine_mrr@10 0.9488
cosine_map@100 0.9488
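
The five tables above report the same model evaluated at the five Matryoshka truncation dimensions (768 down to 64); the metric names correspond to sentence-transformers' InformationRetrievalEvaluator. A hedged sketch of how such scores can be reproduced, with hypothetical toy queries, corpus, and relevance judgments standing in for the real evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("joshuapb/fine-tuned-matryoshka")

# Hypothetical toy data: ids mapped to texts, plus query id -> relevant corpus ids
queries = {"q1": "How does recitation reduce hallucination in RECITE?"}
corpus = {
    "d1": "RECITE relies on recitation as an intermediate step to improve factual correctness.",
    "d2": "Verbalized probability generalizes well under task-difficulty shifts.",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_768")
results = evaluator(model)
print(results)  # cosine_accuracy@k, cosine_precision@k, cosine_ndcg@10, ...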

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 5
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • load_best_model_at_end: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
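
For reference, below is a condensed training sketch consistent with the hyperparameters listed above. The (anchor, positive) pairs are hypothetical stand-ins for the actual 1,810 training examples, and the evaluation-related settings (eval_strategy: epoch, load_best_model_at_end: True) are omitted because they additionally require an evaluation dataset or evaluator.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Hypothetical stand-in for the 1,810 (anchor, positive) training pairs
train_dataset = Dataset.from_dict({
    "anchor": ["RECITE relies on recitation as an intermediate step ..."],
    "positive": ["How does the self-consistency ensemble technique contribute ..."],
})

# MultipleNegativesRankingLoss wrapped in MatryoshkaLoss, as listed in the model tags
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])

args = SentenceTransformerTrainingArguments(
    output_dir="fine-tuned-matryoshka",
    num_train_epochs=5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()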

Training Logs

Epoch Step Training Loss dim_128_cosine_map@100 dim_256_cosine_map@100 dim_512_cosine_map@100 dim_64_cosine_map@100 dim_768_cosine_map@100
0.0220 5 6.6173 - - - - -
0.0441 10 5.5321 - - - - -
0.0661 15 5.656 - - - - -
0.0881 20 4.9256 - - - - -
0.1101 25 5.0757 - - - - -
0.1322 30 5.2047 - - - - -
0.1542 35 5.1307 - - - - -
0.1762 40 4.9219 - - - - -
0.1982 45 5.1957 - - - - -
0.2203 50 5.36 - - - - -
0.2423 55 3.0865 - - - - -
0.2643 60 3.7054 - - - - -
0.2863 65 2.9541 - - - - -
0.3084 70 3.5521 - - - - -
0.3304 75 3.5665 - - - - -
0.3524 80 2.9532 - - - - -
0.3744 85 2.5121 - - - - -
0.3965 90 3.1269 - - - - -
0.4185 95 3.4048 - - - - -
0.4405 100 2.8126 - - - - -
0.4626 105 1.6847 - - - - -
0.4846 110 1.3331 - - - - -
0.5066 115 2.4799 - - - - -
0.5286 120 2.1176 - - - - -
0.5507 125 2.4249 - - - - -
0.5727 130 3.3705 - - - - -
0.5947 135 1.551 - - - - -
0.6167 140 1.328 - - - - -
0.6388 145 1.9353 - - - - -
0.6608 150 2.4254 - - - - -
0.6828 155 1.8436 - - - - -
0.7048 160 1.1937 - - - - -
0.7269 165 2.164 - - - - -
0.7489 170 2.2921 - - - - -
0.7709 175 2.4385 - - - - -
0.7930 180 1.2392 - - - - -
0.8150 185 1.0472 - - - - -
0.8370 190 1.5844 - - - - -
0.8590 195 1.2492 - - - - -
0.8811 200 1.6774 - - - - -
0.9031 205 2.485 - - - - -
0.9251 210 2.4781 - - - - -
0.9471 215 2.4476 - - - - -
0.9692 220 2.6243 - - - - -
0.9912 225 1.3651 - - - - -
1.0 227 - 0.9066 0.9112 0.9257 0.8906 0.9182
1.0132 230 1.0575 - - - - -
1.0352 235 1.4499 - - - - -
1.0573 240 1.4333 - - - - -
1.0793 245 1.1148 - - - - -
1.1013 250 1.259 - - - - -
1.1233 255 0.873 - - - - -
1.1454 260 1.646 - - - - -
1.1674 265 1.7583 - - - - -
1.1894 270 1.2268 - - - - -
1.2115 275 1.3792 - - - - -
1.2335 280 2.5662 - - - - -
1.2555 285 1.5021 - - - - -
1.2775 290 1.1399 - - - - -
1.2996 295 1.3307 - - - - -
1.3216 300 0.7458 - - - - -
1.3436 305 1.1029 - - - - -
1.3656 310 1.0205 - - - - -
1.3877 315 1.0998 - - - - -
1.4097 320 0.8304 - - - - -
1.4317 325 1.3673 - - - - -
1.4537 330 2.4445 - - - - -
1.4758 335 2.8757 - - - - -
1.4978 340 1.7879 - - - - -
1.5198 345 1.1255 - - - - -
1.5419 350 1.6743 - - - - -
1.5639 355 1.3803 - - - - -
1.5859 360 1.1998 - - - - -
1.6079 365 1.2129 - - - - -
1.6300 370 1.6588 - - - - -
1.6520 375 0.9827 - - - - -
1.6740 380 0.605 - - - - -
1.6960 385 1.2934 - - - - -
1.7181 390 1.1776 - - - - -
1.7401 395 1.445 - - - - -
1.7621 400 0.6393 - - - - -
1.7841 405 0.9303 - - - - -
1.8062 410 0.7541 - - - - -
1.8282 415 0.5413 - - - - -
1.8502 420 1.5258 - - - - -
1.8722 425 1.4257 - - - - -
1.8943 430 1.3111 - - - - -
1.9163 435 1.6604 - - - - -
1.9383 440 1.4004 - - - - -
1.9604 445 2.7186 - - - - -
1.9824 450 2.2757 - - - - -
2.0 454 - 0.9401 0.9433 0.9387 0.9386 0.9416
2.0044 455 0.9345 - - - - -
2.0264 460 0.9325 - - - - -
2.0485 465 1.2434 - - - - -
2.0705 470 1.5161 - - - - -
2.0925 475 2.6011 - - - - -
2.1145 480 1.8276 - - - - -
2.1366 485 1.5005 - - - - -
2.1586 490 0.8618 - - - - -
2.1806 495 2.1422 - - - - -
2.2026 500 1.3922 - - - - -
2.2247 505 1.5939 - - - - -
2.2467 510 1.3021 - - - - -
2.2687 515 1.0825 - - - - -
2.2907 520 0.9066 - - - - -
2.3128 525 0.7717 - - - - -
2.3348 530 1.1484 - - - - -
2.3568 535 1.6513 - - - - -
2.3789 540 1.7267 - - - - -
2.4009 545 0.7659 - - - - -
2.4229 550 2.0213 - - - - -
2.4449 555 0.5329 - - - - -
2.4670 560 1.2083 - - - - -
2.4890 565 1.5432 - - - - -
2.5110 570 0.5423 - - - - -
2.5330 575 0.2613 - - - - -
2.5551 580 0.7985 - - - - -
2.5771 585 0.3003 - - - - -
2.5991 590 2.2234 - - - - -
2.6211 595 0.4772 - - - - -
2.6432 600 1.0158 - - - - -
2.6652 605 2.6385 - - - - -
2.6872 610 0.7042 - - - - -
2.7093 615 1.1469 - - - - -
2.7313 620 1.4092 - - - - -
2.7533 625 0.6487 - - - - -
2.7753 630 1.218 - - - - -
2.7974 635 1.1509 - - - - -
2.8194 640 1.1524 - - - - -
2.8414 645 0.6477 - - - - -
2.8634 650 0.6295 - - - - -
2.8855 655 1.3026 - - - - -
2.9075 660 1.9196 - - - - -
2.9295 665 1.3743 - - - - -
2.9515 670 0.8934 - - - - -
2.9736 675 1.1801 - - - - -
2.9956 680 1.2952 - - - - -
3.0 681 - 0.9538 0.9513 0.9538 0.9414 0.9435
3.0176 685 0.3324 - - - - -
3.0396 690 0.9551 - - - - -
3.0617 695 0.9315 - - - - -
3.0837 700 1.3611 - - - - -
3.1057 705 1.4406 - - - - -
3.1278 710 0.5888 - - - - -
3.1498 715 0.9149 - - - - -
3.1718 720 0.5627 - - - - -
3.1938 725 1.6876 - - - - -
3.2159 730 1.1366 - - - - -
3.2379 735 1.3571 - - - - -
3.2599 740 1.5227 - - - - -
3.2819 745 2.5139 - - - - -
3.3040 750 0.3735 - - - - -
3.3260 755 1.4386 - - - - -
3.3480 760 0.3838 - - - - -
3.3700 765 0.3973 - - - - -
3.3921 770 1.4972 - - - - -
3.4141 775 1.5118 - - - - -
3.4361 780 0.478 - - - - -
3.4581 785 1.5982 - - - - -
3.4802 790 0.6209 - - - - -
3.5022 795 0.5902 - - - - -
3.5242 800 1.0877 - - - - -
3.5463 805 0.9553 - - - - -
3.5683 810 0.3054 - - - - -
3.5903 815 1.2229 - - - - -
3.6123 820 0.7434 - - - - -
3.6344 825 1.5447 - - - - -
3.6564 830 1.0751 - - - - -
3.6784 835 0.8161 - - - - -
3.7004 840 0.4382 - - - - -
3.7225 845 1.3547 - - - - -
3.7445 850 1.7112 - - - - -
3.7665 855 0.5362 - - - - -
3.7885 860 0.9309 - - - - -
3.8106 865 1.8301 - - - - -
3.8326 870 1.5554 - - - - -
3.8546 875 1.4035 - - - - -
3.8767 880 1.5814 - - - - -
3.8987 885 0.7283 - - - - -
3.9207 890 1.8549 - - - - -
3.9427 895 0.196 - - - - -
3.9648 900 1.2072 - - - - -
3.9868 905 0.83 - - - - -
4.0 908 - 0.9564 0.9587 0.9612 0.9488 0.9563
4.0088 910 1.7222 - - - - -
4.0308 915 0.6728 - - - - -
4.0529 920 0.9388 - - - - -
4.0749 925 0.7998 - - - - -
4.0969 930 1.1561 - - - - -
4.1189 935 2.4315 - - - - -
4.1410 940 1.3263 - - - - -
4.1630 945 1.2374 - - - - -
4.1850 950 1.1307 - - - - -
4.2070 955 0.5512 - - - - -
4.2291 960 1.3266 - - - - -
4.2511 965 1.2306 - - - - -
4.2731 970 1.7083 - - - - -
4.2952 975 0.7028 - - - - -
4.3172 980 1.2987 - - - - -
4.3392 985 1.545 - - - - -
4.3612 990 1.004 - - - - -
4.3833 995 0.8276 - - - - -
4.4053 1000 1.4694 - - - - -
4.4273 1005 0.4914 - - - - -
4.4493 1010 0.9894 - - - - -
4.4714 1015 0.8855 - - - - -
4.4934 1020 1.1339 - - - - -
4.5154 1025 1.0786 - - - - -
4.5374 1030 1.2547 - - - - -
4.5595 1035 0.5312 - - - - -
4.5815 1040 1.4938 - - - - -
4.6035 1045 0.8124 - - - - -
4.6256 1050 1.2401 - - - - -
4.6476 1055 1.1902 - - - - -
4.6696 1060 1.4183 - - - - -
4.6916 1065 1.0718 - - - - -
4.7137 1070 1.2203 - - - - -
4.7357 1075 0.8535 - - - - -
4.7577 1080 1.2454 - - - - -
4.7797 1085 0.4216 - - - - -
4.8018 1090 0.8327 - - - - -
4.8238 1095 1.2371 - - - - -
4.8458 1100 1.0949 - - - - -
4.8678 1105 1.2177 - - - - -
4.8899 1110 0.6236 - - - - -
4.9119 1115 0.646 - - - - -
4.9339 1120 1.1822 - - - - -
4.9559 1125 1.0471 - - - - -
4.9780 1130 0.7626 - - - - -
5.0 1135 0.9794 0.9564 0.9563 0.9616 0.9488 0.9587
  • The last row (epoch 5.0, step 1135) denotes the saved checkpoint; its per-dimension cosine_map@100 values match the metrics reported above.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}