Ananthu357 committed
Commit
8321e0f
1 Parent(s): 8400ec4

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 1024,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,350 @@
+ ---
+ base_model: BAAI/bge-large-en
+ datasets: []
+ language: []
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:574
+ - loss:CosineSimilarityLoss
+ widget:
+ - source_sentence: What is mentioned regarding the patent errors?
+   sentences:
+   - the Schedule to the Indian Medical Council Act
+   - shall take upon himself and provide for the risk of any error which may subsequently
+     be discovered and shall make no subsequent claim on account thereof.
+   - Omissions and Descrepancies
+ - source_sentence: Is there a way to claim consequential losses?
+   sentences:
+   - The Railway reserves the right of not to invite tenders for any of Railway work
+     or works or to invite open or limited tenders
+   - entitle the Contractor to damages or compensation therefor, but in any such case,
+     the Railway may grant such extension or extensions of the completion date as may
+     be considered reasonable.
+   - "The Railway shall have the right to let other contracts in connection with the\
+     \ works. The Contractor shall afford other Contractors reasonable opportunity\
+     \ for the storage of their materials and the execution of their works and shall\
+     \ properly connect and coordinate his work with theirs. If any part of the Contractor\x92\
+     s work depends upon proper execution or result upon the work of another Contractor(s),\
+     \ the Contractor shall inspect and promptly report to the Engineer any defects\
+     \ in such works that render it unsuitable for such proper execution and results.\
+     \ The Contractor's failure so-to inspect and report shall constitute an acceptance\
+     \ of the other Contractor's work as fit and proper for the reception of his work,\
+     \ except as to defects which may develop in the other Contractor's work after\
+     \ the execution of his work."
+ - source_sentence: Does the contract document contain a indemnification clause provision?
+   sentences:
+   - The partners of the firm to which the Letter of Acceptance (LOA) is issued, shall
+     be jointly and severally liable to the Railway for execution of the contract
+   - Contractor awarded the work shall submit a detailed program of work indicating
+     the time schedule
+   - All notices, communications, reference and complaints made by the Railway or the
+     Engineer or the Engineer's Representative or the Contractor inter-se concerning
+     the works shall be in writing or e-mail on registered e mail IDs and no notice,
+     communication, reference or complaint not in writing or through e-mail, shall
+     be recognized.
+ - source_sentence: Force Majeure
+   sentences:
+   - These Regulations for Tenders and Contracts shall be read in conjunction with
+     the Standard General Conditions of Contract which are referred to herein and shall
+     be subject to modifications additions or suppression by Special Conditions of
+     Contract and/or Special Specifications, if any, annexed to the Tender Forms.
+   - Act of God
+   - 'Instructions: The Engineer shall direct the order in which the several parts
+     of the works shall be executed'
+ - source_sentence: Interpretation of Standard General Conditions of contract
+   sentences:
+   - The Contractor shall at his own expense provide himself with sheds, storehouses
+     and yards in such situations and in such numbers as in the opinion of the Engineer
+     is requisite for carrying on the works and the Contractor
+   - What are the additional documents that have to be read along with the Standard
+     General Conditions of Contract?
+   - the necessity arises for the execution of such items of works that the accepted
+     Schedule of Rates does not include rate or rates for the extra work involved.
+ ---
+
+ # SentenceTransformer based on BAAI/bge-large-en
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-large-en](https://huggingface.co/BAAI/bge-large-en) <!-- at revision abe7d9d814b775ca171121fb03f394dc42974275 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
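+ Because the `Pooling` module keeps only the `[CLS]` token and the final `Normalize()` module L2-normalizes it, every embedding has unit length, so a plain dot product between two embeddings is already their cosine similarity. A minimal sketch verifying this (the example pair is taken from the widget above):
+
+ ```python
+ import numpy as np
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts7.0")
+
+ # CLS-pooled, L2-normalized 1024-dimensional embeddings
+ emb = model.encode(["Force Majeure", "Act of God"])
+ print(emb.shape)                    # (2, 1024)
+ print(np.linalg.norm(emb, axis=1))  # ~[1. 1.], unit norm from Normalize()
+ print(float(emb[0] @ emb[1]))       # dot product equals cosine similarity here
+ ```
+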
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts7.0")
+ # Run inference
+ sentences = [
+     'Interpretation of Standard General Conditions of contract',
+     'What are the additional documents that have to be read along with the Standard General Conditions of Contract?',
+     'The Contractor shall at his own expense provide himself with sheds, storehouses and yards in such situations and in such numbers as in the opinion of the Engineer is requisite for carrying on the works and the Contractor',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 1024]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
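+ The same embeddings support semantic search over a clause corpus via `sentence_transformers.util.semantic_search`. A short sketch, with illustrative clause snippets standing in for a real corpus:
+
+ ```python
+ from sentence_transformers import SentenceTransformer, util
+
+ model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts7.0")
+
+ # Illustrative stand-in corpus, not the training data
+ corpus = [
+     "The Contractor shall at his own expense provide himself with sheds, storehouses and yards.",
+     "The Railway may grant such extension of the completion date as may be considered reasonable.",
+     "All notices and communications concerning the works shall be in writing or e-mail.",
+ ]
+ corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
+
+ query_embedding = model.encode("Is there a way to claim consequential losses?", convert_to_tensor=True)
+
+ # Top-2 most similar clauses for the query
+ hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
+ for hit in hits:
+     print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")
+ ```
+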
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 15
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `batch_sampler`: no_duplicates
+
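+ The tags above indicate training with `CosineSimilarityLoss` on 574 pairs. A minimal sketch of an equivalent run under these non-default settings, assuming a dataset with `sentence1`, `sentence2`, and `score` columns (the tiny inline dataset here is a placeholder, not the actual training data):
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import (
+     SentenceTransformer,
+     SentenceTransformerTrainer,
+     SentenceTransformerTrainingArguments,
+ )
+ from sentence_transformers.losses import CosineSimilarityLoss
+ from sentence_transformers.training_args import BatchSamplers
+
+ model = SentenceTransformer("BAAI/bge-large-en")
+
+ # Placeholder data; the real run used 574 (sentence1, sentence2, score) pairs
+ train_dataset = Dataset.from_dict({
+     "sentence1": ["Force Majeure"],
+     "sentence2": ["Act of God"],
+     "score": [1.0],
+ })
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="outputs",
+     num_train_epochs=15,
+     per_device_train_batch_size=16,
+     per_device_eval_batch_size=16,
+     warmup_ratio=0.1,
+     fp16=True,
+     eval_strategy="steps",
+     batch_sampler=BatchSamplers.NO_DUPLICATES,
+ )
+
+ trainer = SentenceTransformerTrainer(
+     model=model,
+     args=args,
+     train_dataset=train_dataset,
+     eval_dataset=train_dataset,  # stand-in; the original eval split is not published
+     loss=CosineSimilarityLoss(model),
+ )
+ trainer.train()
+ ```
+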
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 15
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch   | Step | Training Loss | Validation Loss |
+ |:-------:|:----:|:-------------:|:---------------:|
+ | 2.7778  | 100  | 0.0571        | 0.0611          |
+ | 5.5556  | 200  | 0.0073        | 0.0604          |
+ | 8.3333  | 300  | 0.0031        | 0.0578          |
+ | 11.1111 | 400  | 0.0019        | 0.0589          |
+ | 13.8889 | 500  | 0.0013        | 0.0587          |
+
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.42.4
+ - PyTorch: 2.3.1+cu121
+ - Accelerate: 0.32.1
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-large-en",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.42.4",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.42.4",
+     "pytorch": "2.3.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e863d8533742461dc7d4bab840b081758d68f56b5418cf9aac5143a18afd7874
+ size 1340612432
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff