Ananthu357 committed
Commit 9b4c43d
1 Parent(s): e5280d3

Add new SentenceTransformer model.
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 1024,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
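
Only `pooling_mode_cls_token` is enabled, so the sentence embedding is the hidden state of the first (`[CLS]`) token; `include_prompt: true` additionally keeps any prompt tokens inside the pooling window. A toy sketch of the selected operation, using dummy tensors:

```python
import torch

# Dummy token embeddings: (batch, seq_len, word_embedding_dimension)
token_embeddings = torch.randn(2, 16, 1024)

# pooling_mode_cls_token: the sentence embedding is the first token's hidden state
sentence_embeddings = token_embeddings[:, 0]
print(sentence_embeddings.shape)  # torch.Size([2, 1024])
```
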
README.md ADDED
@@ -0,0 +1,345 @@
---
base_model: BAAI/bge-large-en
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:649
- loss:CosineSimilarityLoss
widget:
- source_sentence: Dispute resolution
  sentences:
  - Arbitration and Conciliation (Amendment) Act 2015, if they agree for such waiver
    in writing, after dispute having arisen between them, in the format
  - The Earnest Money shall be deposited in cash through e-payment gateway or as mentioned
    in tender documents.
  - of liquidated damages under this condition shall not exceed 5% of the contract
    value
- source_sentence: Order of Precedence is the order with which preference should be
    given to the documents.
  sentences:
  - the sand, stone, clay ballast, earth, trees, rock
  - in case of any difference, contradiction, discrepancy, with regard to conditions
    of tender/contract,
  - If the tenderer(s) deliberately gives / give wrong information in his / their
    tender or creates / create circumstances for the acceptance of his / their tender,
    the Railway reserves the right to reject such tender at any stage.
- source_sentence: Does the contract document contain a 'third-party liability relationship'
    provision?
  sentences:
  - The Contractor shall be responsible for all risk to the work and for trespass
    and shall make good at his own expense all loss or damage whether to the works
    themselves or to any other property of the Railway or the lives, persons or property
    of other
  - This program should indicate the time schedule for various work items in the form
    of a Bar Chart/PERT/CPM.
  - Completion indicated by issuance of maintenance certificate
- source_sentence: What is the impact of breaching the contract conditions on subcontracting?
  sentences:
  - Schedule of Rates
  - What determines the completion of the contract.
  - shall not assign or sublet the contract or any part thereof or allow any person
- source_sentence: Bonus for early completion of work
  sentences:
  - 'as to execution or quality of any work or material, or as to the measurements
    of the works the decision of the Engineer thereon shall be final subject to the
    appeal (within 7 days of such decision being intimated to the Contractor) to the
    Chief Engineer '
  - The maximum bonus shall be limited to 3% of original contract value.
  - The Contractor shall indemnify and save harmless the Railway from and against
    all actions, suit, proceedings, losses, costs, damages, charges, claims and demands
    of every nature and description brought or recovered against the Railways by reason
    of any act or omission of the Contractor, his agents or employees, in the execution
    of the works or in his guarding of the same. All sums payable by way of compensation
    under any of these conditions shall be considered as reasonable compensation to
    be applied to the actual loss or damage sustained, and whether or not any damage
    shall have been sustained.
---

# SentenceTransformer based on BAAI/bge-large-en

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en) <!-- at revision abe7d9d814b775ca171121fb03f394dc42974275 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
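
The stack above amounts to: run the BERT encoder, take the `[CLS]` hidden state as the sentence embedding (per the `Pooling` flags), and L2-normalize it. A minimal sketch of the same computation in plain `transformers`/`torch` (an illustration of the pipeline, not the library's internal code):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Ananthu357/Ananthus-BAAI-for-contracts11.0")
encoder = AutoModel.from_pretrained("Ananthu357/Ananthus-BAAI-for-contracts11.0")

batch = tokenizer(["Dispute resolution"], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, 1024)

cls = hidden[:, 0]                        # Pooling: CLS token
embedding = F.normalize(cls, p=2, dim=1)  # Normalize: unit L2 norm
```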

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts11.0")
# Run inference
sentences = [
    'Bonus for early completion of work',
    'The maximum bonus shall be limited to 3% of original contract value.',
    'The Contractor shall indemnify and save harmless the Railway from and against all actions, suit, proceedings, losses, costs, damages, charges, claims and demands of every nature and description brought or recovered against the Railways by reason of any act or omission of the Contractor, his agents or employees, in the execution of the works or in his guarding of the same. All sums payable by way of compensation under any of these conditions shall be considered as reasonable compensation to be applied to the actual loss or damage sustained, and whether or not any damage shall have been sustained.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
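
Because the embeddings are unit-normalized, cosine similarity coincides with the dot product, which makes ranking clauses against a query cheap. A small retrieval sketch (the clause strings below are placeholders, not the training data):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts11.0")

clauses = [
    "The maximum bonus shall be limited to 3% of original contract value.",
    "The Earnest Money shall be deposited in cash through e-payment gateway.",
]
query_embedding = model.encode("Bonus for early completion of work")
clause_embeddings = model.encode(clauses)

# Rank clauses by cosine similarity to the query
hits = util.semantic_search(query_embedding, clause_embeddings, top_k=2)[0]
for hit in hits:
    print(clauses[hit["corpus_id"]], round(hit["score"], 3))
```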

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 15
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates
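
A run with these settings could be reconstructed roughly as below. This is a hedged sketch, not the author's script: the two training pairs are placeholders standing in for the 649-example dataset, and `output_dir` is arbitrary.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses
from sentence_transformers.training_args import BatchSamplers, SentenceTransformerTrainingArguments

model = SentenceTransformer("BAAI/bge-large-en")

# Placeholder pairs with similarity labels in [0, 1]
train_dataset = Dataset.from_dict({
    "sentence1": ["Bonus for early completion of work", "Dispute resolution"],
    "sentence2": ["The maximum bonus shall be limited to 3% of original contract value.",
                  "Schedule of Rates"],
    "score": [1.0, 0.0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="outputs",
    num_train_epochs=15,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; a held-out split belongs here
    loss=losses.CosineSimilarityLoss(model),
)
trainer.train()
```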

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 15
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch   | Step | Training Loss | Validation Loss |
|:-------:|:----:|:-------------:|:---------------:|
| 2.4390  | 100  | 0.0672        | 0.0435          |
| 4.8780  | 200  | 0.0132        | 0.0396          |
| 7.3171  | 300  | 0.0052        | 0.0404          |
| 9.7561  | 400  | 0.0027        | 0.0419          |
| 12.1951 | 500  | 0.0020        | 0.0420          |
| 14.6341 | 600  | 0.0014        | 0.0423          |


### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.3.1+cu121
- Accelerate: 0.32.1
- Datasets: 2.21.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
{
  "_name_or_path": "BAAI/bge-large-en",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.42.4",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.42.4",
    "pytorch": "2.3.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
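
Since `prompts` is empty and `default_prompt_name` is null, no instruction text is prepended automatically at encode time. If you want the query instruction recommended for the BAAI/bge family, you can pass it yourself; a hedged example (the instruction string comes from the base model's documentation, not from this config):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts11.0")
# Optionally prepend an instruction to queries at encode time.
query_embedding = model.encode(
    "Dispute resolution",
    prompt="Represent this sentence for searching relevant passages: ",
)
```
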
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e95a387d8b909189100334df7df1de5787c3aa2cd7460f272bece7425b739304
size 1340612432
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
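
These settings describe a standard lowercasing BERT WordPiece tokenizer with the usual special tokens. A quick sketch of what they imply in practice:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Ananthu357/Ananthus-BAAI-for-contracts11.0")
ids = tokenizer("Dispute Resolution")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
# Expected: ['[CLS]', 'dispute', 'resolution', '[SEP]'] -- lowercased, wrapped in special tokens
```
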
vocab.txt ADDED
The diff for this file is too large to render. See raw diff