metadata
base_model: BAAI/bge-base-en-v1.5
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1340
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: Who popularized the term 'Dalit'?
    sentences:
      - >-
        Fakhruddin Ali Ahmed was the fifth President of India from 1974 to 1977
        and also the 2nd President of India to die in office.
      - >-
        Arunachal Pradesh or South Tibet is a state between India and China. The
        country that owns this region is disputed. China says that they own it
        and call it South Tibet (Zangnan 藏南). In 2017, China started renaming
        places in this territory. In 2019 China destroyed 30,000 "incorrect"
        world maps that showed South Tibet as part of India.
      - >-
        "Dalit" refers to socially, economically and historically marginalized
        communities predominantly in India . It also means "broken/scattered" in
        Sanskrit and Hindi . The term "dalits" was in use as a translation for
        the British Raj census classification of "Depressed Classes" prior to
        1935. It was popularised by the economist and reformer B. R. Ambedkar
        (1891–1956), who included all depressed people irrespective of their
        caste into the definition of dalits. Hence the first group he made was
        called the "Labour Party" and included as its members all people of the
        society who were kept depressed, including women, small scale farmers
        and people from backward castes.
  - source_sentence: What is India's contribution to the Olympic Movement?
    sentences:
      - >-
        Prem Pal Singh Rawat (in India called Maharaji and in the past called
        Guru Maharaj Ji and Balyogeshwar) was born in India on December 10,
        1957. He teaches inner peace by the use of what he calls "Knowledge".
        Groups that have helped him are the Divine Light Mission, Elan Vital
        (1983), and The Prem Rawat Foundation (2001).
      - >-
        Boota Singh (Gurmukhi: ਬੂਟਾ ਸਿੰਘ; Shahmukhi: بوٹا سنگھ), sometimes
        spelled as Buta Singh, was a Sikh soldier in the British Army. He served
        in Burma during World War II, under the command of Lord Mountbatten. He
        is very well known in India and Pakistan. He is famous for his tragic
        love story with Zainab, a Muslim girl who he rescued from the riots
        during the partition of India in 1947.
      - >-
        India at the Olympics is a history which includes 32 games in 19
        countries and 800+ athletes. Since 1900, India has contributed to the
        growth of the "Olympic Movement".
  - source_sentence: What is significant about the fort in Jhansi?
    sentences:
      - >-
        Western India is a region of the Republic of India, it includes Gujarat,
        Madhya Pradesh and Maharashtra.
      - >-
        The Government of India Act 1858 was an Act of the Parliament of the
        United Kingdom (21 & 22 Vict. c. 106) passed on August 2, 1858. Its
        provisions called for the liquidation of the British East India Company
        (who had up to this point been ruling British India under the auspices
        of Parliament) and the transference of its functions to the British
        Crown.
      - >-
        Jhansi is a historic city of India between the rivers Pahunj and Betwa
        in the northern state of Uttar Pradesh, close to the border with Madhya
        Pradesh. Jhansi is the administrative headquarters of Jhansi District
        and Jhansi Division. The original walled city grew up around its stone
        fort, which was built in 1613. The city is well connected to all other
        major towns in Uttar Pradesh by road and railway networks. It is called
        "gateway to Bundelkhand". Jhansi was besieged and taken by British
        forces in 1858 during the Indian Rebellion of 1857.
  - source_sentence: How is Dhanteras celebrated in Nepal?
    sentences:
      - >-
        The National Stock Exchange of India Limited (NSE), is a Mumbai-based
        stock exchange. It is the biggest stock exchange in India and the third
        biggest in the world in terms of amounts of transactions. NSE is
        mutually-owned by a set of leading financial institutions, banks,
        insurance companies and other financial intermediaries in India but its
        ownership and management operate as separate groups. As of 2006, the NSE
        VSAT terminals, 2799 in total, cover more than 1500 cities across India.
        In July 2007, the NSE had a total market capitalization of 42,74,509
        crore INR making it the second-largest stock market in South Asia in
        terms of market-capitalization.
      - >-
        Dhanteras (Sanskrit: धनतेरस), also known as Dhanatrayodashi () or
        Dhanvantari Trayodashi, is the first day of the festival of Diwali in
        India and the festival of Tihar in Nepal.
      - >-
        Perur taluk is a taluk in Coimbatore district, Tamil Nadu, India
        associated with the neighbourhood of Perur. It was created by Government
        of Tamil Nadu in 2013.
  - source_sentence: What political roles did Rao hold in Andhra Pradesh?
    sentences:
      - >-
        The 2023 ICC Cricket World Cup is scheduled to be hosted by India and
        India was selected as the host at an International Cricket Council (ICC)
        meeting in London in June 2013. This will be the 13th Cricket World Cup
        competition. It will be the fourth time that India will be the host.
        This will be the first time that India has hosted the tournament on its
        own. India hosted previous World Cup tournaments in 1987 (with
        Pakistan), 1996 (with Pakistan and Sri Lanka) and 2011 (with Sri Lanka
        and Bangladesh). The semi final will be played at Wankhede Stadium. And
        final will be played at Eden Gardens, Kolkata.
      - >-
        Ayyavazhi (, "path of the father"), is a religion with one god that
        started in South India in the middle of the 19th century. The 'zhi' ()
        in the word, 'Ayyavazhi', is a retroflex, ri.
      - >-
        Balli Durga Prasad Rao (15 June 1956 – 16 September 2020) was an Indian
        politician. He was elected to the Lok Sabha, lower house of the
        Parliament of India in the 2019 Indian general election. He was a member
        of the YSR Congress Party. Rao was also a member of the Andhra Pradesh
        MLA from 1985 to 1989, 1994 to 1999, and 2009 to 2014.

SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
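
The three modules above can be read as: encode the tokens with BertModel, keep only the vector at the first ([CLS]) position, then L2-normalize it. The following is a minimal pure-Python sketch of the last two steps, using made-up token vectors in place of real transformer outputs. The helper names `cls_pool`, `l2_normalize`, and `dot` are illustrative and are not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector (the [CLS] position)."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring what a Normalize step does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy stand-in for transformer output: 3 tokens, 4 dimensions
# (the real model produces 768 dimensions per token).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] position
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]
sentence_embedding = l2_normalize(cls_pool(token_embeddings))
```

Because every output vector has unit length, the dot product of two embeddings equals their cosine similarity, which is consistent with the Cosine Similarity function listed above.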

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("dipanjanS/bge-base-en-v1.5-fte")
# Run inference
sentences = [
    'What political roles did Rao hold in Andhra Pradesh?',
    'Balli Durga Prasad Rao (15 June 1956 – 16 September 2020) was an Indian politician. He was elected to the Lok Sabha, lower house of the Parliament of India in the 2019 Indian general election. He was a member of the YSR Congress Party. Rao was also a member of the Andhra Pradesh MLA from 1985 to 1989, 1994 to 1999, and 2009 to 2014.',
    'Ayyavazhi (, "path of the father"), is a religion with one god that started in South India in the middle of the 19th century. The \'zhi\' () in the word, \'Ayyavazhi\', is a retroflex, ri.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
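
Since the model was trained on question/context pairs, a natural follow-up is to rank candidate passages for a question. The sketch below covers only the ranking step and uses hypothetical scores. In practice the scores would come from the similarity matrix above, for example something like `similarities[0, 1:].tolist()` if the first sentence is the question. `rank_passages` is an illustrative helper, not a library function.

```python
def rank_passages(query_scores, passages, top_k=3):
    """Sort passages by descending similarity score and keep the top_k."""
    ranked = sorted(zip(query_scores, passages), key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]

# Hypothetical scores for illustration only; they are not real model outputs.
scores = [0.21, 0.78, 0.35]
passages = [
    "passage about Ayyavazhi",
    "passage about Balli Durga Prasad Rao",
    "passage about the Cricket World Cup",
]
best = rank_passages(scores, passages, top_k=2)
for score, passage in best:
    print(f"{score:.2f}  {passage}")
```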

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,340 training samples
  • Columns: question and context
  • Approximate statistics based on the first 1000 samples:
    • question (string): min 6 tokens, mean 12.39 tokens, max 24 tokens
    • context (string): min 9 tokens, mean 83.99 tokens, max 510 tokens
  • Samples:
    • question: What is Basil commonly known as?
      context: Basil ("Ocimum basilicum") ( or ) is a plant of the Family Lamiaceae. It is also known as Sweet Basil or Tulsi. It is a tender low-growing herb that is grown as a perennial in warm, tropical climates. Basil is originally native to India and other tropical regions of Asia. It has been cultivated there for more than 5,000 years. It is prominently featured in many cuisines throughout the world. Some of them are Italian, Thai, Vietnamese and Laotian cuisines. It grows to between 30–60 cm tall. It has light green, silky leaves 3–5 cm long and 1–3 cm broad. The leaves are opposite each other. The flowers are quite big. They are white in color and arranged as a spike.
    • question: Where is Basil originally native to?
      context: Basil ("Ocimum basilicum") ( or ) is a plant of the Family Lamiaceae. It is also known as Sweet Basil or Tulsi. It is a tender low-growing herb that is grown as a perennial in warm, tropical climates. Basil is originally native to India and other tropical regions of Asia. It has been cultivated there for more than 5,000 years. It is prominently featured in many cuisines throughout the world. Some of them are Italian, Thai, Vietnamese and Laotian cuisines. It grows to between 30–60 cm tall. It has light green, silky leaves 3–5 cm long and 1–3 cm broad. The leaves are opposite each other. The flowers are quite big. They are white in color and arranged as a spike.
    • question: What is the significance of the Roerich Pact?
      context: The Roerich Pact is a treaty on Protection of Artistic and Scientific Institutions and Historic Monuments, signed by the representatives of 21 states in the Oval Office of the White House on 15 April 1935. As of January 1, 1990, the Roerich Pact had been ratified by ten nations: Brazil, Chile, Colombia, Cuba, the Dominican Republic, El Salvador, Guatemala, Mexico, the United States, and Venezuela. It went into effect on 26 August 1935. The Government of India approved the Treaty in 1948, but did not take any further formal action. The Roerich Pact is also known as "Pax Cultura" ("Cultural Peace" or "Peace through Culture"). The most important part of the Roerich Pact is the legal recognition that the protection of culture is always more important than any military necessity.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
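
To make the two parameters concrete: this loss scores every question in a batch against every context in the same batch, multiplies the cosine similarities by `scale`, and applies cross-entropy with the matching context as the correct class. The following is a simplified pure-Python sketch of that idea on toy 2-D vectors. It is not the library's implementation, and `cosine` and `mnr_loss` are illustrative names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (the "cos_sim" choice above)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(question_embs, context_embs, scale=20.0):
    """Mean cross-entropy where context i is the correct class for question i."""
    losses = []
    for i, question in enumerate(question_embs):
        logits = [scale * cosine(question, context) for context in context_embs]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy embeddings: in `aligned`, each question points at its own context;
# in `swapped`, each question points at the other question's context.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]
swapped = [[0.1, 1.0], [1.0, 0.1]]

loss_aligned = mnr_loss(questions, aligned)
loss_swapped = mnr_loss(questions, swapped)
print(loss_aligned < loss_swapped)
```

A larger `scale` sharpens the softmax, so a small cosine gap between the true context and the in-batch negatives translates into a larger difference in loss.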
    

Evaluation Dataset

Unnamed Dataset

  • Size: 100 evaluation samples
  • Columns: question and context
  • Approximate statistics based on the first 1000 samples:
    • question (string): min 7 tokens, mean 12.36 tokens, max 19 tokens
    • context (string): min 12 tokens, mean 84.15 tokens, max 235 tokens
  • Samples:
    • question: What is the demographic composition of Kolathur?
      context: Kolathur () is a town in Salem district in the Indian state of Tamil Nadu. As of the 2001 India census, Kolathur had a population of 10,319. Males make up 53% of the population and females 47%. A total of 9% of the population is under 6 years of age.
    • question: What is notable about India's democracy?
      context: India is a country in Asia. It has an area of . It is at the center of South Asia. India has more than 1.2 billion (1,210,000,000) people, which is the second largest population in the world. It is the seventh largest country in the world by area and the largest country in South Asia. It is also the most populous democracy in the world.
    • question: Who was the Chief Justice of India before Dipak Misra?
      context: Justice Dipak Misra (born 3 October 1953) was the Judge of the Supreme Court and the Chief Justice of India. He took over as the 45th Chief Justice of India (CJI), succeeding the 44th CJI, Justice J. S. Khehar.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 3e-06
  • max_steps: 332
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
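
The values above imply a particular training length and learning-rate shape. The sketch below works through that arithmetic, assuming a single device, no gradient accumulation, and a linear schedule with warmup. The variable and function names are illustrative, and the warmup rounding (ceiling of `max_steps * warmup_ratio`) is an assumption about how the trainer derives the step count.

```python
import math

train_samples = 1340
batch_size = 16
max_steps = 332
warmup_ratio = 0.1
peak_lr = 3e-06

# With the last partial batch kept, one pass over the data takes 84 steps.
steps_per_epoch = math.ceil(train_samples / batch_size)
epochs_covered = max_steps / steps_per_epoch

# Assumed rounding: warmup length is the ceiling of max_steps * warmup_ratio.
warmup_steps = math.ceil(max_steps * warmup_ratio)

def linear_lr(step):
    """Linear warmup to peak_lr, then linear decay to zero at max_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (max_steps - step) / max(1, max_steps - warmup_steps))

print(steps_per_epoch, warmup_steps, round(epochs_covered, 2))
```

Under these assumptions, step 20 corresponds to epoch 20 / 84 ≈ 0.2381, which matches the first row of the Training Logs.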

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 3e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: 332
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
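
The `batch_sampler: no_duplicates` setting matters for this dataset because several questions share the same context (see the two Basil samples above). With in-batch negatives, a repeated context would be treated as a negative for a question it actually answers. The following is a simplified greedy sketch of the idea, not the library's sampler; `no_duplicate_batches` is a hypothetical helper and the passage strings are placeholders.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily build batches in which no question or context text repeats."""
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for question, context in remaining:
            fits = len(batch) < batch_size
            if fits and question not in seen and context not in seen:
                batch.append((question, context))
                seen.update((question, context))
            else:
                # Defer anything that would repeat a text or overflow the batch.
                deferred.append((question, context))
        batches.append(batch)
        remaining = deferred
    return batches

pairs = [
    ("What is Basil commonly known as?", "Basil passage"),
    ("Where is Basil originally native to?", "Basil passage"),
    ("What is the significance of the Roerich Pact?", "Roerich Pact passage"),
]
batches = no_duplicate_batches(pairs, batch_size=2)
for batch in batches:
    print([question for question, _ in batch])
```

Here the two Basil questions land in different batches, so neither sees the shared passage as a negative.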

Training Logs

Epoch   Step  Training Loss  Validation Loss
0.2381  20    0.1832         0.0491
0.4762  40    0.1118         0.0246
0.7143  60    0.0991         0.0152
0.9524  80    0.0518         0.0106
1.1905  100   0.0665         0.0073
1.4286  120   0.0539         0.0058
1.6667  140   0.0548         0.0048
1.9048  160   0.0354         0.0041
2.1429  180   0.038          0.0034
2.3810  200   0.0592         0.0030
2.6190  220   0.0203         0.0027
2.8571  240   0.0441         0.0025
3.0952  260   0.023          0.0024
3.3333  280   0.0452         0.0023
3.5714  300   0.0128         0.0022
3.8095  320   0.0495         0.0022

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}