BGE base En v1.5 Phase 5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
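
The three modules above amount to: run BERT, keep the first ([CLS]) token vector, and scale it to unit length. A minimal NumPy sketch of the last two stages follows, assuming the transformer output is already available as an array. The function name `cls_pool_and_normalize` and the random stand-in data are illustrative and not part of the library.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token vectors to (batch, hidden) unit vectors."""
    # pooling_mode_cls_token=True: keep only the vector at position 0 ([CLS]).
    cls_vectors = token_embeddings[:, 0, :]
    # Normalize(): divide each vector by its L2 norm (guarding against zero).
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)

# Random stand-in for BertModel output: 2 sequences, 4 tokens, 768 dimensions.
rng = np.random.default_rng(0)
fake_tokens = rng.normal(size=(2, 4, 768))
sentence_vectors = cls_pool_and_normalize(fake_tokens)
print(sentence_vectors.shape)
```

Because every output vector has unit length, the dot product of two embeddings equals their cosine similarity.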

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("RishuD7/bge-base-en-v1.5-76-keys-phase-6-exp_v1")
# Run inference
sentences = [
    '31. HOLDING OVER. If Tenant remains in possession of the Leased Premises after\nexpiration of the Term, or after any termination of the Lease by Landlord without written agreement\nbetween the parties, Tenant shall be a tenant at sufferance and such tenancy shall be subject to the\nprovisions hereof, except that Rent for said holdover period shall be one hundred fifty percent (150%) of\nthe amount of Rent due in the last month of the Term. Nothing in this Section 29 shall be construed as\nconsent by Landlord to the possession of the Leased Premises by Tenant after the expiration of the Term\nor termination of the Lease by Landlord. ',
    'Holding Rent',
    'Does landlord confirm to no eminent domain on the property',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
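
For retrieval, the similarity scores are typically turned into a ranking of candidates for each query. The sketch below shows that step with NumPy on made-up 3-dimensional vectors. In practice the inputs would come from `model.encode`. The helper `rank_candidates` is a hypothetical name, not a library function.

```python
import numpy as np

def rank_candidates(query_vec: np.ndarray, candidate_vecs: np.ndarray):
    """Return candidate indices ordered from most to least similar, plus raw scores."""
    # Normalize both sides so the dot product is the cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)  # descending by similarity
    return order, scores

# Toy vectors standing in for one clause embedding and three label embeddings.
query = np.array([1.0, 0.0, 0.0])
candidates = np.array([
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
])
order, scores = rank_candidates(query, candidates)
print(order.tolist())
```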

Evaluation

Metrics

Information Retrieval

Metric                 Value
cosine_accuracy@1      0.011
cosine_accuracy@3      0.0228
cosine_accuracy@5      0.036
cosine_accuracy@10     0.0772
cosine_precision@1     0.011
cosine_precision@3     0.0076
cosine_precision@5     0.0072
cosine_precision@10    0.0077
cosine_recall@1        0.011
cosine_recall@3        0.0228
cosine_recall@5        0.036
cosine_recall@10       0.0772
cosine_ndcg@10         0.0362
cosine_mrr@10          0.0243
cosine_map@100         0.0367
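
In the table, accuracy@k equals recall@k, and precision@k is accuracy@k divided by k (for example, 0.0228 / 3 ≈ 0.0076). That pattern is what one would expect when each query has exactly one relevant document. This is an inference from the numbers, not something the card states. Under that assumption, the metrics can be sketched as follows. The function `ir_metrics_at_k` and the toy ranks are illustrative.

```python
def ir_metrics_at_k(ranks, k):
    """Compute accuracy, precision, recall, and MRR at cutoff k.

    ranks: 1-based rank of the single relevant document per query,
           or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        # With one relevant document per query, recall@k equals accuracy@k.
        "recall": accuracy,
        # At most one of the k retrieved items can be relevant.
        "precision": accuracy / k,
        # Reciprocal rank counts only when the hit falls inside the cutoff.
        "mrr": sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n,
    }

# Four toy queries: relevant document ranked 1st, 4th, missing, and 2nd.
toy_ranks = [1, 4, None, 2]
metrics = ir_metrics_at_k(toy_ranks, k=3)
print(metrics)
```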

Training Details

Training Dataset

Unnamed Dataset

  • Size: 8,290 training samples
  • Columns: positive and anchor
  • Approximate statistics based on the first 1000 samples:
               positive         anchor
    type       string           string
    min        98 tokens        4 tokens
    mean       298.43 tokens    5.99 tokens
    max        512 tokens       12 tokens
  • Samples:
    • positive: The Landlord shall have the right, at any time during the Term, to relocate the Premises to other premises (the "New Premises") in the Development on the same terms and conditions as are set out in this Lease provided that: (a) the Landlord shall first have given not less than 90 days notice to the Tenant; (b) the Landlord shall endeavour to ensure that the New Premises be of comparable size and quality to the Premises; (c) the Landlord shall pay the reasonable costs incurred by the Tenant for: (i) its physical move; (ii) the reconnection of existing communication lines; and (iii) the reordering of new printed material plates and the printing of an equal quantity and quality of printed material the tenant has in stock as the time of the relocation; (d) if the Rentable Area of the New Premises is not the same as the Rentable Area of the Premises, the total Basic Rent payable under this Lease (but not the Basic Rent per square foot of Rentable Area) shall be adjusted accordingly; and (e)...
      anchor: Right to Relocate
    • positive: 39. Holdover: If Tenant shall hold over after the expiration of the Lease Term, without written agreement providing otherwise, Tenant shall be deemed to be a tenant at sufferance on month to month basis, at a monthly rental, payable in advance, equal to double the base rent then being paid by Tenant, and Tenant shall be bound by all of the other terms, covenants and agreements of the Lease. Nothing contained herein shall be construed to give Tenant the right to hold over at any time, extend the Term or prevent Landlord from immediate recovery of possession of the Premises by summary proceedings or otherwise and Landlord may exercise any and all remedies at law or in equity to recover possession of the Premises, as well as any damages incurred by Landlord, by Tenant's failure to vacate the Premises and deliver possession to Landlord as herein provided.
      anchor: Holding Over
    • positive: 30. HOLDING OVER. If Tenant remains in possession of the Leased Premises after expiration of the Term, or after any termination of the Lease by Landlord without written agreement between the parties, Tenant shall be a tenant at sufferance and such tenancy shall be subject to the provisions hereof, except that Gross Rent for said holdover period shall be one hundred twenty five percent (125%) of the amount of Gross Rent due in the last month of the Term. Nothing in this Section 30 shall be construed as consent by Landlord to the possession of the Leased Premises by Tenant after the expiration of the Term or termination of the Lease by Landlord. In the event Tenant provides written notice to Landlord of its intent to holdover at least sixty (60) days prior to the end of the Term and Landlord does not object to such request within thirty (30) days after receipt thereof, it shall be deemed that Landlord has consented to such holdover and this Lease shall continue on a month-to-month basis ...
      anchor: Holding Over
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
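
As a rough illustration of what this loss computes: each anchor is scored against every positive in the batch using cosine similarity multiplied by `scale`, and cross-entropy pushes the matching pair (the diagonal) above the in-batch negatives. The NumPy sketch below is a simplified re-derivation under those assumptions, not the library's implementation. `mnrl_loss` is a hypothetical name.

```python
import numpy as np

def mnrl_loss(anchor_emb: np.ndarray, positive_emb: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    a = anchor_emb / np.linalg.norm(anchor_emb, axis=1, keepdims=True)
    p = positive_emb / np.linalg.norm(positive_emb, axis=1, keepdims=True)
    logits = scale * (a @ p.T)  # (batch, batch); entry [i, j] scores anchor i vs positive j
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct "class" for anchor i is positive i, i.e. the diagonal.
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: perfectly aligned pairs versus pairs shifted by one position.
aligned = np.eye(3)
loss_good = mnrl_loss(aligned, aligned)
loss_bad = mnrl_loss(aligned, np.roll(aligned, 1, axis=0))
print(loss_good < loss_bad)
```

Since every other positive in the batch is treated as a negative, two samples with the same anchor (such as the two "Holding Over" rows above) would act as false negatives if they landed in one batch. That is presumably the motivation for the `no_duplicates` batch sampler listed under the hyperparameters.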
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 30
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • tf32: False
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
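
These settings imply an effective batch size and a total number of optimizer steps. The arithmetic below is a sketch that assumes a single training device (the card does not state the device count) and that the batch sampler yields the full ceil(8290 / 32) batches per epoch. The floor division for the total step count is an assumption about how the trainer rounds.

```python
import math

train_samples = 8290              # from "Size: 8,290 training samples"
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_train_epochs = 30
num_devices = 1                   # assumption: not stated in the card

# Samples contributing to each optimizer update.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices

# Micro-batches per epoch (dataloader_drop_last is False, so the last partial batch counts).
batches_per_epoch = math.ceil(train_samples / (per_device_train_batch_size * num_devices))

# Optimizer updates per epoch, fractional and floored.
updates_per_epoch = batches_per_epoch / gradient_accumulation_steps
total_updates = num_train_epochs * (batches_per_epoch // gradient_accumulation_steps)

print(effective_batch, batches_per_epoch, updates_per_epoch, total_updates)
# 512 260 16.25 480
print(round(10 / updates_per_epoch, 4))
# 0.6154
```

Under these assumptions the result is consistent with the Training Logs table: the last logged step is 480, and the first logged row pairs step 10 with epoch 0.6154.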

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 30
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: False
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
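
The combination of `lr_scheduler_type: cosine`, `warmup_ratio: 0.1`, and `learning_rate: 2e-05` describes a linear warmup followed by a half-cosine decay. The sketch below assumes 480 total optimizer steps (the last step in the training logs) and a warmup length of ceil(total_steps * warmup_ratio). `cosine_lr` is an illustrative helper, not the trainer's scheduler.

```python
import math

def cosine_lr(step: int, base_lr: float = 2e-5, total_steps: int = 480,
              warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

schedule = [cosine_lr(s) for s in range(481)]
print(max(schedule))
```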

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10
0.6154 10 2.5422 -
1.2308 20 1.3661 -
1.8462 30 0.1879 -
2.4615 40 0.0 -
3.0769 50 0.0 -
3.3846 55 - 0.0252
1.2846 60 0.8868 -
1.9 70 1.4243 -
2.5154 80 0.1644 -
3.1308 90 0.0041 -
3.7462 100 0.0 -
4.3615 110 0.0 0.0301
2.5692 120 1.0665 -
3.1846 130 0.4817 -
3.8 140 0.0021 -
4.4154 150 0.0 -
5.0308 160 0.0 -
5.4 166 - 0.0328
3.2385 170 0.4318 -
3.8538 180 0.7595 -
4.4692 190 0.0737 -
5.0846 200 0.0004 -
5.7 210 0.0 -
6.3154 220 0.0 -
6.3769 221 - 0.0354
4.5231 230 0.736 -
5.1385 240 0.3332 -
5.7538 250 0.0008 -
6.3692 260 0.0 -
6.9846 270 0.0 -
7.3538 276 - 0.0336
5.1923 280 0.3014 -
5.8077 290 0.5931 -
6.4231 300 0.0735 -
7.0385 310 0.0002 -
7.6538 320 0.0 -
8.2692 330 0.0 -
8.3923 332 - 0.0374
6.4769 340 0.5984 -
7.0923 350 0.2797 -
7.7077 360 0.0005 -
8.3231 370 0.0 -
8.9385 380 0.0 -
9.3692 387 - 0.0355
7.1462 390 0.1997 -
7.7615 400 0.5201 -
8.3769 410 0.0799 -
8.9923 420 0.0001 -
9.6077 430 0.0 -
10.2231 440 0.0 -
10.4077 443 - 0.0362
8.4308 450 0.5072 -
9.0462 460 0.2583 -
9.6615 470 0.0005 -
10.2769 480 0.0 0.0362
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.3.1
  • Transformers: 4.43.1
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.2.1
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}