metadata
base_model: BAAI/bge-large-en
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:416
- loss:CosineSimilarityLoss
widget:
- source_sentence: Omissions and Descrepancies
sentences:
- ' the Railway may incur in reference thereto, shall be charged to the Contractor'
- >-
as to execution or quality of any work or material, or as to the
measurements of the works the decision of the Engineer thereon shall be
final subject to the appeal (within 7 days of such decision being
intimated to the Contractor) to the Chief Engineer
- notify the authority inviting tenders.
- source_sentence: Type of contract
sentences:
- "\_ \_ \_ \_ Third party liability relationship is present in this contract."
- Bill of quantities
- purpose of works either free of cost or pay thecost of the same.
- source_sentence: Project schedules like Bar chart, CPM, PERT
sentences:
- >-
after the date of receipt of the acceptance letter in respect of
contracts with initial completion period of two years or less or not
later than 90 days for other contracts have to submit the detailed
programme of work
- "\_ \_ \_ \_ No certificate other than Maintenance Certificate, if applicable, referred to in Clause 50 of the Conditions shall be deemed to constitute approval"
- "\_All temporary works necessary for the proper execution of the works shall be provided and maintained by the Contractor"
- source_sentence: What is the role of the Engineer in the completion of works?
sentences:
- >-
Conditions of Contract for the completion of works to the entire
satisfaction of the Engineer.
- "\_ \_ \_ \_ Third party liability relationship is present in this contract."
- >-
Any item of work carried out by the Contractor on the instructions of
the Engineer which is not included in the accepted Schedules of Rates
- source_sentence: >-
Members of the entity to which the contract is awarded, shall be jointly
and severally liable to the Railway for execution of the project in
accordance with General and Special Conditions of Contract.
sentences:
- >-
Once having entered into above arrangement, Contractor shall discontinue
such
arrangement, if he intends to do so at his own or on the instructions of
Railway, with
prior intimation to Chief Engineer.
- Does the contract contain a 'third party liability relations' clause?
- >-
All gold, silver, oil, other minerals of any description, all precious
stones, coins, treasures relics antiquities and other similar things
which shall be found in or upon the site shall be the property of the
Railway
SentenceTransformer based on BAAI/bge-large-en
This is a sentence-transformers model finetuned from BAAI/bge-large-en. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-large-en
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
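The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It assumes that `pooling_mode_cls_token: True` means the first token's vector is taken as the sentence embedding, and that `Normalize()` rescales that vector to unit L2 length. The helper names `cls_pool` and `l2_normalize` and the toy 4-dimensional vectors are hypothetical stand-ins, not part of the library.

```python
import math


def cls_pool(token_embeddings):
    """Use the first ([CLS]) token's vector as the sentence vector."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit L2 length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Toy stand-in for the Transformer output: 3 tokens with 4 dimensions each.
# (The real model produces 1024 dimensions per token.)
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because every output vector has unit length, the cosine similarity between two embeddings reduces to their dot product.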
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts4.0")
# Run inference
sentences = [
'Members of the entity to which the contract is awarded, shall be jointly and severally liable to the Railway for execution of the project in accordance with General and Special Conditions of Contract.',
"Does the contract contain a 'third party liability relations' clause?",
'All gold, silver, oil, other minerals of any description, all precious stones, coins, treasures relics antiquities and other similar things which shall be found in or upon the site shall be the property of the Railway',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
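The similarity scores can then be used to rank candidate clauses against a query. The following is a minimal sketch, not the library's own search utility: `rank_candidates` is a hypothetical helper, and the 2-dimensional vectors are toy stand-ins for real embeddings (which could be obtained as plain lists via `model.encode(...).tolist()`). It assumes the embeddings are already unit-length, as the Normalize() stage in the architecture suggests, so a dot product serves as the cosine similarity.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_candidates(query_embedding, candidate_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [dot(query_embedding, c) for c in candidate_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy unit-length vectors standing in for real 1024-dimensional embeddings.
query = [1.0, 0.0]
candidates = [
    [0.0, 1.0],  # orthogonal to the query
    [0.6, 0.8],  # partially aligned
    [1.0, 0.0],  # identical direction
]

ranking = rank_candidates(query, candidates)
for index, score in ranking:
    print(index, round(score, 3))
```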
Training Details
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 25
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
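To see what these settings imply for the learning-rate schedule, here is a rough sketch under several assumptions: a single device with no gradient accumulation, a nominal step count of ceil(dataset_size / batch_size) per epoch (416 comes from the dataset_size tag), and the linear scheduler with a peak learning rate of 5e-05 from the full hyperparameter list. The `no_duplicates` batch sampler may yield a different number of batches in a given epoch, so the real step counts can deviate. `lr_at` is a hypothetical helper, not a library function.

```python
import math

DATASET_SIZE = 416   # from the dataset_size tag
BATCH_SIZE = 16      # per_device_train_batch_size
EPOCHS = 25          # num_train_epochs
WARMUP_RATIO = 0.1   # warmup_ratio
PEAK_LR = 5e-5       # learning_rate from the full hyperparameter list

# Nominal step counts (assumption: one device, no gradient accumulation).
steps_per_epoch = math.ceil(DATASET_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, lr_at(step))
```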
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 25
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
3.2692 | 100 | 0.0576 | 0.0560 |
6.5385 | 200 | 0.0069 | 0.0631 |
9.8077 | 300 | 0.0044 | 0.0651 |
13.0769 | 400 | 0.0023 | 0.0567 |
16.1538 | 500 | 0.0013 | 0.0588 |
19.4231 | 600 | 0.001 | 0.0596 |
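The loss values above come from training with the loss:CosineSimilarityLoss tag listed in the metadata. A minimal sketch of that objective follows, assuming the library's default behaviour of taking the mean squared error between the cosine similarity of each sentence pair and its gold score. The functions and toy vectors here are hypothetical illustrations written in plain Python, not the library implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs, labels):
    """Mean squared error between predicted cosine scores and gold scores."""
    errors = [
        (cosine_similarity(u, v) - label) ** 2
        for (u, v), label in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


# Toy embedding pairs with gold similarity labels.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical vectors, cosine 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal vectors, cosine 0.0
]
labels = [1.0, 0.5]

loss = cosine_similarity_loss(pairs, labels)
print(loss)
```

Under that assumption, a loss of 0.0560 would correspond to a root-mean-square error of roughly 0.24 in the predicted similarity score.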
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.0+cu121
- Accelerate: 0.31.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}