moritzglnr/bge-base-financial-matryoshka

@@ -31,50 +31,48 @@ tags:
 - loss:MatryoshkaLoss
 - loss:MultipleNegativesRankingLoss
 widget:
-- source_sentence: R&D expense increased by $304 million, or 14.9%, led by Intelligent
-    Edge, HPC & AI and Storage in fiscal 2023.
   sentences:
-  - What was the growth rate of Visa Inc.'s overall total nominal volume from 2021
-    to 2022?
-  - How much did Hewlett Packard Enterprise's R&D expenses increase in fiscal 2023?
-  - What is the purpose of the Global Day of Joy at Hasbro?
-- source_sentence: In 2022 and continuing into 2023, the Russia-Ukraine conflict was
-    a catalyst for an energy crisis in Europe. Government interventions related to
-    the energy crisis resulting from the Russia-Ukraine conflict, such as the Market
-    Correction Mechanism (price cap), or interventions that may be proposed in the
-    future related to the Russia-Ukraine conflict or the conflict in Israel and Gaza
-    could also have a negative impact on our business.
   sentences:
-  - What are Garmin's core strategies for reducing its environmental impact?
-  - What are the potential consequences of the Russia-Ukraine conflict on a company's
-    business?
-  - What factors influence HP's critical accounting estimates?
-- source_sentence: The increase in other income, net was primarily due to an increase
-    in interest income as a result of higher cash balances and higher interest rates.
   sentences:
-  - What was the primary reason for the increase in other income, net during the noted
-    period?
-  - What led to the increase in room expenses at Las Vegas Sands Corp. in 2023?
-  - What was the provision for income taxes for the year ended June 30, 2023?
-- source_sentence: When an investment declines below cost basis, management evaluates
-    whether the decline in fair value is other than temporary. If deemed other than
-    temporary, an impairment charge is recorded.
   sentences:
-  - What are the requirements for Gilead's cell therapy products under the FDA's Risk
-    Evaluation and Mitigation Strategy program?
-  - What are the four focus areas declared by the company to strengthen their performance
-    going forward?
-  - What triggers the requirement for management to record an impairment charge for
-    investments?
-- source_sentence: The total gross fair value of derivatives was listed as $422,232
-    million as per the latest financial data without adjustments for counterparty
-    netting or collateral.
   sentences:
-  - What was the total gross fair value of derivatives as of December 2023 before
-    netting adjustments in the consolidated financial statements?
-  - How does the company handle the recording and disclosure of contingent liabilities?
-  - What is the significance of reporting financial results on a constant currency
-    basis?
 model-index:
 - name: BGE base Financial Matryoshka
   results:
@@ -86,49 +84,49 @@ model-index:
       type: dim_768
     metrics:
     - type: cosine_accuracy@1
-      value: 0.7071428571428572
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
-      value: 0.8214285714285714
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
-      value: 0.8614285714285714
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
-      value: 0.9042857142857142
       name: Cosine Accuracy@10
     - type: cosine_precision@1
-      value: 0.7071428571428572
       name: Cosine Precision@1
     - type: cosine_precision@3
-      value: 0.2738095238095238
       name: Cosine Precision@3
     - type: cosine_precision@5
-      value: 0.17228571428571426
       name: Cosine Precision@5
     - type: cosine_precision@10
-      value: 0.09042857142857141
       name: Cosine Precision@10
     - type: cosine_recall@1
-      value: 0.7071428571428572
       name: Cosine Recall@1
     - type: cosine_recall@3
-      value: 0.8214285714285714
       name: Cosine Recall@3
     - type: cosine_recall@5
-      value: 0.8614285714285714
       name: Cosine Recall@5
     - type: cosine_recall@10
-      value: 0.9042857142857142
       name: Cosine Recall@10
     - type: cosine_ndcg@10
-      value: 0.8050065074948352
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
-      value: 0.7732902494331064
       name: Cosine Mrr@10
     - type: cosine_map@100
-      value: 0.776990609765374
       name: Cosine Map@100
   - task:
       type: information-retrieval
@@ -138,49 +136,49 @@ model-index:
       type: dim_512
     metrics:
     - type: cosine_accuracy@1
-      value: 0.7014285714285714
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
-      value: 0.8214285714285714
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
-      value: 0.8657142857142858
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
-      value: 0.9057142857142857
       name: Cosine Accuracy@10
     - type: cosine_precision@1
-      value: 0.7014285714285714
       name: Cosine Precision@1
     - type: cosine_precision@3
-      value: 0.2738095238095238
       name: Cosine Precision@3
     - type: cosine_precision@5
-      value: 0.17314285714285713
       name: Cosine Precision@5
     - type: cosine_precision@10
-      value: 0.09057142857142855
       name: Cosine Precision@10
     - type: cosine_recall@1
-      value: 0.7014285714285714
       name: Cosine Recall@1
     - type: cosine_recall@3
-      value: 0.8214285714285714
       name: Cosine Recall@3
     - type: cosine_recall@5
-      value: 0.8657142857142858
       name: Cosine Recall@5
     - type: cosine_recall@10
-      value: 0.9057142857142857
       name: Cosine Recall@10
     - type: cosine_ndcg@10
-      value: 0.8035496957871646
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
-      value: 0.7707964852607707
       name: Cosine Mrr@10
     - type: cosine_map@100
-      value: 0.7744696266512991
       name: Cosine Map@100
   - task:
       type: information-retrieval
@@ -190,49 +188,49 @@ model-index:
       type: dim_256
     metrics:
     - type: cosine_accuracy@1
-      value: 0.6885714285714286
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
-      value: 0.8157142857142857
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
-      value: 0.86
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
-      value: 0.9014285714285715
       name: Cosine Accuracy@10
     - type: cosine_precision@1
-      value: 0.6885714285714286
       name: Cosine Precision@1
     - type: cosine_precision@3
-      value: 0.27190476190476187
       name: Cosine Precision@3
     - type: cosine_precision@5
-      value: 0.172
       name: Cosine Precision@5
     - type: cosine_precision@10
-      value: 0.09014285714285714
       name: Cosine Precision@10
     - type: cosine_recall@1
-      value: 0.6885714285714286
       name: Cosine Recall@1
     - type: cosine_recall@3
-      value: 0.8157142857142857
       name: Cosine Recall@3
     - type: cosine_recall@5
-      value: 0.86
       name: Cosine Recall@5
     - type: cosine_recall@10
-      value: 0.9014285714285715
       name: Cosine Recall@10
     - type: cosine_ndcg@10
-      value: 0.7959304086509564
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
-      value: 0.7620759637188204
       name: Cosine Mrr@10
     - type: cosine_map@100
-      value: 0.7656989001700307
       name: Cosine Map@100
   - task:
       type: information-retrieval
@@ -242,49 +240,49 @@ model-index:
       type: dim_128
     metrics:
     - type: cosine_accuracy@1
-      value: 0.6871428571428572
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
-      value: 0.7871428571428571
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
-      value: 0.8257142857142857
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
-      value: 0.8828571428571429
       name: Cosine Accuracy@10
     - type: cosine_precision@1
-      value: 0.6871428571428572
       name: Cosine Precision@1
     - type: cosine_precision@3
-      value: 0.2623809523809524
       name: Cosine Precision@3
     - type: cosine_precision@5
-      value: 0.16514285714285712
       name: Cosine Precision@5
     - type: cosine_precision@10
-      value: 0.08828571428571427
       name: Cosine Precision@10
     - type: cosine_recall@1
-      value: 0.6871428571428572
       name: Cosine Recall@1
     - type: cosine_recall@3
-      value: 0.7871428571428571
       name: Cosine Recall@3
     - type: cosine_recall@5
-      value: 0.8257142857142857
       name: Cosine Recall@5
     - type: cosine_recall@10
-      value: 0.8828571428571429
       name: Cosine Recall@10
     - type: cosine_ndcg@10
-      value: 0.7805054661054854
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
-      value: 0.7483526077097503
       name: Cosine Mrr@10
     - type: cosine_map@100
-      value: 0.7524860233992903
       name: Cosine Map@100
   - task:
       type: information-retrieval
@@ -294,49 +292,49 @@ model-index:
       type: dim_64
     metrics:
     - type: cosine_accuracy@1
-      value: 0.64
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
-      value: 0.7557142857142857
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
-      value: 0.7828571428571428
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
-      value: 0.8428571428571429
       name: Cosine Accuracy@10
     - type: cosine_precision@1
-      value: 0.64
       name: Cosine Precision@1
     - type: cosine_precision@3
-      value: 0.25190476190476185
       name: Cosine Precision@3
     - type: cosine_precision@5
-      value: 0.15657142857142856
       name: Cosine Precision@5
     - type: cosine_precision@10
-      value: 0.08428571428571427
       name: Cosine Precision@10
     - type: cosine_recall@1
-      value: 0.64
       name: Cosine Recall@1
     - type: cosine_recall@3
-      value: 0.7557142857142857
       name: Cosine Recall@3
     - type: cosine_recall@5
-      value: 0.7828571428571428
       name: Cosine Recall@5
     - type: cosine_recall@10
-      value: 0.8428571428571429
       name: Cosine Recall@10
     - type: cosine_ndcg@10
-      value: 0.7386047605712329
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
-      value: 0.7057772108843535
       name: Cosine Mrr@10
     - type: cosine_map@100
-      value: 0.7112870933540941
       name: Cosine Map@100
 ---
@@ -390,9 +388,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("moritzglnr/bge-base-financial-matryoshka")
 # Run inference
 sentences = [
-    'The total gross fair value of derivatives was listed as $422,232 million as per the latest financial data without adjustments for counterparty netting or collateral.',
-    'What was the total gross fair value of derivatives as of December 2023 before netting adjustments in the consolidated financial statements?',
-    'How does the company handle the recording and disclosure of contingent liabilities?',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -436,67 +434,67 @@ You can finetune this model on your own dataset.
 * Dataset: `dim_768`
 * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
-| Metric              | Value     |
-|:--------------------|:----------|
-| cosine_accuracy@1   | 0.7071    |
-| cosine_accuracy@3   | 0.8214    |
-| cosine_accuracy@5   | 0.8614    |
-| cosine_accuracy@10  | 0.9043    |
-| cosine_precision@1  | 0.7071    |
-| cosine_precision@3  | 0.2738    |
-| cosine_precision@5  | 0.1723    |
-| cosine_precision@10 | 0.0904    |
-| cosine_recall@1     | 0.7071    |
-| cosine_recall@3     | 0.8214    |
-| cosine_recall@5     | 0.8614    |
-| cosine_recall@10    | 0.9043    |
-| cosine_ndcg@10      | 0.805     |
-| cosine_mrr@10       | 0.7733    |
-| **cosine_map@100**  | **0.777** |
-#### Information Retrieval
-* Dataset: `dim_512`
-* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
 | Metric              | Value      |
 |:--------------------|:-----------|
-| cosine_accuracy@1   | 0.7014     |
-| cosine_accuracy@3   | 0.8214     |
-| cosine_accuracy@5   | 0.8657     |
 | cosine_accuracy@10  | 0.9057     |
-| cosine_precision@1  | 0.7014     |
-| cosine_precision@3  | 0.2738     |
-| cosine_precision@5  | 0.1731     |
 | cosine_precision@10 | 0.0906     |
-| cosine_recall@1     | 0.7014     |
-| cosine_recall@3     | 0.8214     |
-| cosine_recall@5     | 0.8657     |
 | cosine_recall@10    | 0.9057     |
-| cosine_ndcg@10      | 0.8035     |
-| cosine_mrr@10       | 0.7708     |
-| **cosine_map@100**  | **0.7745** |
 #### Information Retrieval
-* Dataset: `dim_256`
 * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
 | Metric              | Value      |
 |:--------------------|:-----------|
-| cosine_accuracy@1   | 0.6886     |
 | cosine_accuracy@3   | 0.8157     |
-| cosine_accuracy@5   | 0.86       |
-| cosine_accuracy@10  | 0.9014     |
-| cosine_precision@1  | 0.6886     |
 | cosine_precision@3  | 0.2719     |
-| cosine_precision@5  | 0.172      |
-| cosine_precision@10 | 0.0901     |
-| cosine_recall@1     | 0.6886     |
 | cosine_recall@3     | 0.8157     |
-| cosine_recall@5     | 0.86       |
-| cosine_recall@10    | 0.9014     |
-| cosine_ndcg@10      | 0.7959     |
-| cosine_mrr@10       | 0.7621     |
-| **cosine_map@100**  | **0.7657** |
 #### Information Retrieval
 * Dataset: `dim_128`
@@ -504,21 +502,21 @@ You can finetune this model on your own dataset.
 | Metric              | Value      |
 |:--------------------|:-----------|
-| cosine_accuracy@1   | 0.6871     |
-| cosine_accuracy@3   | 0.7871     |
-| cosine_accuracy@5   | 0.8257     |
-| cosine_accuracy@10  | 0.8829     |
-| cosine_precision@1  | 0.6871     |
-| cosine_precision@3  | 0.2624     |
-| cosine_precision@5  | 0.1651     |
-| cosine_precision@10 | 0.0883     |
-| cosine_recall@1     | 0.6871     |
-| cosine_recall@3     | 0.7871     |
-| cosine_recall@5     | 0.8257     |
-| cosine_recall@10    | 0.8829     |
-| cosine_ndcg@10      | 0.7805     |
-| cosine_mrr@10       | 0.7484     |
-| **cosine_map@100**  | **0.7525** |
 #### Information Retrieval
 * Dataset: `dim_64`
@@ -526,21 +524,21 @@ You can finetune this model on your own dataset.
 | Metric              | Value      |
 |:--------------------|:-----------|
-| cosine_accuracy@1   | 0.64       |
-| cosine_accuracy@3   | 0.7557     |
-| cosine_accuracy@5   | 0.7829     |
-| cosine_accuracy@10  | 0.8429     |
-| cosine_precision@1  | 0.64       |
-| cosine_precision@3  | 0.2519     |
-| cosine_precision@5  | 0.1566     |
-| cosine_precision@10 | 0.0843     |
-| cosine_recall@1     | 0.64       |
-| cosine_recall@3     | 0.7557     |
-| cosine_recall@5     | 0.7829     |
-| cosine_recall@10    | 0.8429     |
-| cosine_ndcg@10      | 0.7386     |
-| cosine_mrr@10       | 0.7058     |
-| **cosine_map@100**  | **0.7113** |
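
The tables above list accuracy, precision, recall and MRR at several cutoffs. In every table, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k, which is what one would expect if each query has exactly one relevant document. The following is a minimal sketch under that assumption; it is not the `InformationRetrievalEvaluator` implementation, and the function name `metrics_at_k` and the toy ranks are hypothetical.

```python
def metrics_at_k(ranks, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    `ranks` holds, per query, the 1-based rank of its single relevant
    document, or None when that document was not retrieved at all.
    Assumption: exactly one relevant document per query.
    """
    n = len(ranks)
    # A query is a hit when its relevant document appears in the top k.
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    # With one relevant document, at most one of the k slots is relevant.
    precision = accuracy / k
    # With one relevant document, recall is either 0 or 1 per query.
    recall = accuracy
    # Reciprocal rank counts only when the document is inside the cutoff.
    mrr = sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mrr": mrr,
    }


# Hypothetical ranks for five queries; None means "not retrieved".
toy_ranks = [1, 1, 2, 4, None]
m3 = metrics_at_k(toy_ranks, 3)
```

The same relation can be checked against the reported `dim_768` values: 0.8214 / 3 ≈ 0.2738, which matches `cosine_precision@3`.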
 <!--
 ## Bias, Risks and Limitations
@@ -567,13 +565,13 @@ You can finetune this model on your own dataset.
   |         | positive                                                                           | anchor                                                                            |
   |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
   | type    | string                                                                             | string                                                                            |
-  | details | <ul><li>min: 2 tokens</li><li>mean: 45.41 tokens</li><li>max: 371 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 20.32 tokens</li><li>max: 51 tokens</li></ul> |
 * Samples:
-  | positive                                                                                                                                                                                                                                         | anchor                                                                                                              |
-  |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------|
-  | <code>The 2023 Form 10-K for Delta Air Lines, Inc. includes various types of financial statements such as consolidated balance sheets, consolidated statements of operations, comprehensive income, cash flows, and stockholders' equity.</code> | <code>What are the primary types of financial statements included in Delta Air Lines, Inc.'s 2023 Form 10-K?</code> |
-  | <code>Critical accounting estimates are those that involve a significant level of estimation uncertainty and have had or are reasonably likely to have a material impact on HP's financial condition or results of operations.</code>            | <code>What factors influence HP's critical accounting estimates?</code>                                             |
-  | <code>The requisite service period for both employee stock options and RSUs is generally four years from the grant date.</code>                                                                                                                  | <code>What is the recognition period for Etsy's stock options and RSUs granted to employees?</code>                 |
 * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
   ```json
   {
@@ -599,15 +597,18 @@ You can finetune this model on your own dataset.
 ### Training Hyperparameters
 #### Non-Default Hyperparameters
 - `per_device_train_batch_size`: 32
 - `per_device_eval_batch_size`: 16
 - `gradient_accumulation_steps`: 16
 - `learning_rate`: 2e-05
-- `num_train_epochs`: 1
 - `lr_scheduler_type`: cosine
 - `warmup_ratio`: 0.1
-- `tf32`: False
 - `load_best_model_at_end`: True
 - `batch_sampler`: no_duplicates
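
The batch size and gradient accumulation settings above combine into an effective batch size per optimizer step. The sketch below works through that arithmetic and uses the logged epoch fraction from the training log (step 10 at epoch 0.8122) to estimate the training set size. The device count is an assumption, since the card does not state it, and the sample estimate is only approximate.

```python
# Values taken from the non-default hyperparameters listed above.
per_device_train_batch_size = 32
gradient_accumulation_steps = 16

# Assumption: single-device training (not stated in the card).
num_devices = 1

# Number of samples contributing to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# From the training log: optimizer step 10 is reported at epoch 0.8122.
steps_per_epoch = 10 / 0.8122

# Rough estimate of how many training samples make up one epoch.
approx_train_samples = steps_per_epoch * effective_batch_size

print(effective_batch_size)
print(round(approx_train_samples))
```

Under these assumptions one epoch covers only about a dozen optimizer steps, so the logged checkpoint at step 12 falls near the end of the first epoch.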
 #### All Hyperparameters
@@ -615,6 +616,7 @@ You can finetune this model on your own dataset.
 - `overwrite_output_dir`: False
 - `do_predict`: False
 - `prediction_loss_only`: True
 - `per_device_train_batch_size`: 32
 - `per_device_eval_batch_size`: 16
@@ -628,7 +630,7 @@ You can finetune this model on your own dataset.
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
-- `num_train_epochs`: 1
 - `max_steps`: -1
 - `lr_scheduler_type`: cosine
 - `lr_scheduler_kwargs`: {}
@@ -641,6 +643,7 @@ You can finetune this model on your own dataset.
 - `save_safetensors`: True
 - `save_on_each_node`: False
 - `save_only_model`: False
 - `no_cuda`: False
 - `use_cpu`: False
 - `use_mps_device`: False
@@ -648,13 +651,13 @@ You can finetune this model on your own dataset.
 - `data_seed`: None
 - `jit_mode_eval`: False
 - `use_ipex`: False
-- `bf16`: False
 - `fp16`: False
 - `fp16_opt_level`: O1
 - `half_precision_backend`: auto
 - `bf16_full_eval`: False
 - `fp16_full_eval`: False
-- `tf32`: False
 - `local_rank`: 0
 - `ddp_backend`: None
 - `tpu_num_cores`: None
@@ -673,10 +676,10 @@ You can finetune this model on your own dataset.
 - `fsdp_min_num_params`: 0
 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
 - `fsdp_transformer_layer_cls_to_wrap`: None
-- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None}
 - `deepspeed`: None
 - `label_smoothing_factor`: 0.0
-- `optim`: adamw_torch
 - `optim_args`: None
 - `adafactor`: False
 - `group_by_length`: False
@@ -716,6 +719,7 @@ You can finetune this model on your own dataset.
 - `include_num_input_tokens_seen`: False
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
 - `batch_sampler`: no_duplicates
 - `multi_dataset_batch_sampler`: proportional
@@ -724,18 +728,24 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
 |:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
-| 0.8122     | 10     | 1.4747        | -                      | -                      | -                      | -                     | -                      |
-| **0.9746** | **12** | **-**         | **0.7525**             | **0.7657**             | **0.7745**             | **0.7113**            | **0.777**              |
 * The bold row denotes the saved checkpoint.
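
The `dim_*` columns in the log report retrieval quality for embeddings of different sizes. With Matryoshka-style training, a shorter embedding is usually obtained by keeping the leading components of the full vector and rescaling to unit length. A minimal sketch of that step in plain Python follows; the 8-dimensional toy vectors stand in for real 768-dimensional model output, and the helper names are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for full-size embeddings (values are made up).
query = [0.9, 0.1, 0.3, 0.0, 0.2, -0.1, 0.05, 0.4]
doc = [0.8, 0.2, 0.1, 0.1, -0.3, 0.2, 0.10, -0.2]

# Similarity using every component.
full_score = cosine(query, doc)

# Similarity using only the first four components of each vector.
short_query = truncate_and_normalize(query, 4)
short_doc = truncate_and_normalize(doc, 4)
short_score = cosine(short_query, short_doc)
```

In recent Sentence Transformers releases, the `truncate_dim` argument of the `SentenceTransformer` constructor is intended to apply this truncation when encoding; check the documentation for the installed version before relying on it.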
 ### Framework Versions
-- Python: 3.11.9
 - Sentence Transformers: 3.0.1
-- Transformers: 4.40.2
-- PyTorch: 2.3.1
 - Accelerate: 0.32.1
-- Datasets: 2.20.0
 - Tokenizers: 0.19.1
 ## Citation

 - loss:MatryoshkaLoss
 - loss:MultipleNegativesRankingLoss
 widget:
+- source_sentence: The table indicates that 18,000 deferred shares were granted to
+    non-employee directors in fiscal 2020, 15,000 in fiscal 2021, and 19,000 in fiscal
+    2022.
   sentences:
+  - What was the primary reason for the increased audit effort for PCC goodwill and
+    indefinite-lived intangible assets?
+  - How many deferred shares were granted to non-employee directors in fiscal 2020,
+    2021, and 2022?
+  - What was the total intrinsic value of options exercised in fiscal year 2023?
+- source_sentence: In Resource Masking Industries, we expect the profit impact from
+    lower sales volume to be partially offset by favorable price realization.
   sentences:
+  - By what percentage did Electronic Arts' operating income grow in the fiscal year
+    ended March 31, 2023?
+  - What impact is expected on Resource Industries' profit due to lower sales volume?
+  - How are IBM’s 2023 Annual Report to Stockholders' financial statements made a
+    part of Form 10-K?
+- source_sentence: The actuarial gain during the year ended December 31, 2022 was
+    primarily related to increases in the discount rate assumptions.
   sentences:
+  - What was the primary reason for the actuarial gain during the year ended December
+    31, 2022?
+  - How much did Ford's total assets amount to by December 31, 2023?
+  - What was the remaining available amount of the share repurchase authorization
+    as of January 29, 2023?
+- source_sentence: Returned $1.7 billion to shareholders through share repurchases
+    and dividend payments.
   sentences:
+  - What was the carrying amount of investments without readily determinable fair
+    values as of December 31, 2023?
+  - What are the significant inputs to the valuation of Goldman Sachs' unsecured short-
+    and long-term borrowings?
+  - How much did the company return to shareholders through share repurchases and
+    dividend payments in 2022?
+- source_sentence: The remaining amount available for borrowing under the Revolving
+    Credit Facility as of December 31, 2023, was $2,245.2 million.
   sentences:
+  - What was the total amount available for borrowing under the Revolving Credit Facility
+    at Iron Mountain as of December 31, 2023?
+  - What type of information is included in Note 13 of the Annual Report on Form 10-K?
+  - How much did local currency revenue increase in Latin America in 2023 compared
+    to 2022?
 model-index:
 - name: BGE base Financial Matryoshka
   results:
       type: dim_768
     metrics:
     - type: cosine_accuracy@1
+      value: 0.6828571428571428
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
+      value: 0.8242857142857143
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
+      value: 0.8557142857142858
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
+      value: 0.9057142857142857
       name: Cosine Accuracy@10
     - type: cosine_precision@1
+      value: 0.6828571428571428
       name: Cosine Precision@1
     - type: cosine_precision@3
+      value: 0.2747619047619047
       name: Cosine Precision@3
     - type: cosine_precision@5
+      value: 0.17114285714285712
       name: Cosine Precision@5
     - type: cosine_precision@10
+      value: 0.09057142857142855
       name: Cosine Precision@10
     - type: cosine_recall@1
+      value: 0.6828571428571428
       name: Cosine Recall@1
     - type: cosine_recall@3
+      value: 0.8242857142857143
       name: Cosine Recall@3
     - type: cosine_recall@5
+      value: 0.8557142857142858
       name: Cosine Recall@5
     - type: cosine_recall@10
+      value: 0.9057142857142857
       name: Cosine Recall@10
     - type: cosine_ndcg@10
+      value: 0.7963610970343802
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
+      value: 0.7612930839002267
       name: Cosine Mrr@10
     - type: cosine_map@100
+      value: 0.7648513048205645
       name: Cosine Map@100
   - task:
       type: information-retrieval
       type: dim_512
     metrics:
     - type: cosine_accuracy@1
+      value: 0.68
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
+      value: 0.8157142857142857
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
+      value: 0.8542857142857143
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
+      value: 0.9
       name: Cosine Accuracy@10
     - type: cosine_precision@1
+      value: 0.68
       name: Cosine Precision@1
     - type: cosine_precision@3
+      value: 0.27190476190476187
       name: Cosine Precision@3
     - type: cosine_precision@5
+      value: 0.17085714285714285
       name: Cosine Precision@5
     - type: cosine_precision@10
+      value: 0.09
       name: Cosine Precision@10
     - type: cosine_recall@1
+      value: 0.68
       name: Cosine Recall@1
     - type: cosine_recall@3
+      value: 0.8157142857142857
       name: Cosine Recall@3
     - type: cosine_recall@5
+      value: 0.8542857142857143
       name: Cosine Recall@5
     - type: cosine_recall@10
+      value: 0.9
       name: Cosine Recall@10
     - type: cosine_ndcg@10
+      value: 0.7911616934987842
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
+      value: 0.7562284580498863
       name: Cosine Mrr@10
     - type: cosine_map@100
+      value: 0.760087172570928
       name: Cosine Map@100
   - task:
       type: information-retrieval
       type: dim_256
     metrics:
     - type: cosine_accuracy@1
+      value: 0.68
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
+      value: 0.8114285714285714
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
+      value: 0.8485714285714285
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
+      value: 0.8971428571428571
       name: Cosine Accuracy@10
     - type: cosine_precision@1
+      value: 0.68
       name: Cosine Precision@1
     - type: cosine_precision@3
+      value: 0.2704761904761905
       name: Cosine Precision@3
     - type: cosine_precision@5
+      value: 0.16971428571428568
       name: Cosine Precision@5
     - type: cosine_precision@10
+      value: 0.0897142857142857
       name: Cosine Precision@10
     - type: cosine_recall@1
+      value: 0.68
       name: Cosine Recall@1
     - type: cosine_recall@3
+      value: 0.8114285714285714
       name: Cosine Recall@3
     - type: cosine_recall@5
+      value: 0.8485714285714285
       name: Cosine Recall@5
     - type: cosine_recall@10
+      value: 0.8971428571428571
       name: Cosine Recall@10
     - type: cosine_ndcg@10
+      value: 0.7888581850866868
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
+      value: 0.7542278911564625
       name: Cosine Mrr@10
     - type: cosine_map@100
+      value: 0.7579536807505182
       name: Cosine Map@100
   - task:
       type: information-retrieval
       type: dim_128
     metrics:
     - type: cosine_accuracy@1
+      value: 0.6571428571428571
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
+      value: 0.79
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
+      value: 0.8285714285714286
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
+      value: 0.8857142857142857
       name: Cosine Accuracy@10
     - type: cosine_precision@1
+      value: 0.6571428571428571
       name: Cosine Precision@1
     - type: cosine_precision@3
+      value: 0.2633333333333333
       name: Cosine Precision@3
     - type: cosine_precision@5
+      value: 0.1657142857142857
       name: Cosine Precision@5
     - type: cosine_precision@10
+      value: 0.08857142857142856
       name: Cosine Precision@10
     - type: cosine_recall@1
+      value: 0.6571428571428571
       name: Cosine Recall@1
     - type: cosine_recall@3
+      value: 0.79
       name: Cosine Recall@3
     - type: cosine_recall@5
+      value: 0.8285714285714286
       name: Cosine Recall@5
     - type: cosine_recall@10
+      value: 0.8857142857142857
       name: Cosine Recall@10
     - type: cosine_ndcg@10
+      value: 0.7703812626851927
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
+      value: 0.733632653061224
       name: Cosine Mrr@10
     - type: cosine_map@100
+      value: 0.7378840513025602
       name: Cosine Map@100
   - task:
       type: information-retrieval
       type: dim_64
     metrics:
     - type: cosine_accuracy@1
+      value: 0.62
       name: Cosine Accuracy@1
     - type: cosine_accuracy@3
+      value: 0.77
       name: Cosine Accuracy@3
     - type: cosine_accuracy@5
+      value: 0.8028571428571428
       name: Cosine Accuracy@5
     - type: cosine_accuracy@10
+      value: 0.85
       name: Cosine Accuracy@10
     - type: cosine_precision@1
+      value: 0.62
       name: Cosine Precision@1
     - type: cosine_precision@3
+      value: 0.25666666666666665
       name: Cosine Precision@3
     - type: cosine_precision@5
+      value: 0.16057142857142856
       name: Cosine Precision@5
     - type: cosine_precision@10
+      value: 0.085
       name: Cosine Precision@10
     - type: cosine_recall@1
+      value: 0.62
       name: Cosine Recall@1
     - type: cosine_recall@3
+      value: 0.77
       name: Cosine Recall@3
     - type: cosine_recall@5
+      value: 0.8028571428571428
       name: Cosine Recall@5
     - type: cosine_recall@10
+      value: 0.85
       name: Cosine Recall@10
     - type: cosine_ndcg@10
+      value: 0.73777886683529
       name: Cosine Ndcg@10
     - type: cosine_mrr@10
+      value: 0.7016190476190474
       name: Cosine Mrr@10
     - type: cosine_map@100
+      value: 0.7073607864232172
       name: Cosine Map@100
 ---
 model = SentenceTransformer("moritzglnr/bge-base-financial-matryoshka")
 # Run inference
 sentences = [
+    'The remaining amount available for borrowing under the Revolving Credit Facility as of December 31, 2023, was $2,245.2 million.',
+    'What was the total amount available for borrowing under the Revolving Credit Facility at Iron Mountain as of December 31, 2023?',
+    'What type of information is included in Note 13 of the Annual Report on Form 10-K?',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 * Dataset: `dim_768`
 * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
 | Metric              | Value      |
 |:--------------------|:-----------|
+| cosine_accuracy@1   | 0.6829     |
+| cosine_accuracy@3   | 0.8243     |
+| cosine_accuracy@5   | 0.8557     |
 | cosine_accuracy@10  | 0.9057     |
+| cosine_precision@1  | 0.6829     |
+| cosine_precision@3  | 0.2748     |
+| cosine_precision@5  | 0.1711     |
 | cosine_precision@10 | 0.0906     |
+| cosine_recall@1     | 0.6829     |
+| cosine_recall@3     | 0.8243     |
+| cosine_recall@5     | 0.8557     |
 | cosine_recall@10    | 0.9057     |
+| cosine_ndcg@10      | 0.7964     |
+| cosine_mrr@10       | 0.7613     |
+| **cosine_map@100**  | **0.7649** |
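The precision and recall rows in this table appear to follow directly from the accuracy rows: recall@k equals accuracy@k, and precision@k is accuracy@k divided by k. That pattern is what one would expect if every evaluation query has exactly one relevant document, which is an assumption here, not something the card states. A minimal sketch that checks the relationship against the `dim_768` numbers (the helper name `precision_from_accuracy` is hypothetical):

```python
# Reported cosine_accuracy@k for dim_768, copied from the table above.
accuracy_at_k = {1: 0.6829, 3: 0.8243, 5: 0.8557, 10: 0.9057}
# Reported cosine_precision@k for the same evaluation.
reported_precision_at_k = {1: 0.6829, 3: 0.2748, 5: 0.1711, 10: 0.0906}


def precision_from_accuracy(accuracy: float, k: int) -> float:
    """Precision@k under the assumption of one relevant document per query."""
    return accuracy / k


derived_precision_at_k = {
    k: precision_from_accuracy(acc, k) for k, acc in accuracy_at_k.items()
}

for k in sorted(derived_precision_at_k):
    print(
        f"precision@{k}: derived {derived_precision_at_k[k]:.4f}, "
        f"reported {reported_precision_at_k[k]:.4f}"
    )
```

The derived values agree with the reported ones to within the rounding of the table, so the precision and recall rows add little information beyond the accuracy rows for this dataset.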
 #### Information Retrieval
+* Dataset: `dim_512`
 * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
 | Metric              | Value      |
 |:--------------------|:-----------|
+| cosine_accuracy@1   | 0.68       |
 | cosine_accuracy@3   | 0.8157     |
+| cosine_accuracy@5   | 0.8543     |
+| cosine_accuracy@10  | 0.9        |
+| cosine_precision@1  | 0.68       |
 | cosine_precision@3  | 0.2719     |
+| cosine_precision@5  | 0.1709     |
+| cosine_precision@10 | 0.09       |
+| cosine_recall@1     | 0.68       |
 | cosine_recall@3     | 0.8157     |
+| cosine_recall@5     | 0.8543     |
+| cosine_recall@10    | 0.9        |
+| cosine_ndcg@10      | 0.7912     |
+| cosine_mrr@10       | 0.7562     |
+| **cosine_map@100**  | **0.7601** |
+#### Information Retrieval
+* Dataset: `dim_256`
+* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+| Metric              | Value     |
+|:--------------------|:----------|
+| cosine_accuracy@1   | 0.68      |
+| cosine_accuracy@3   | 0.8114    |
+| cosine_accuracy@5   | 0.8486    |
+| cosine_accuracy@10  | 0.8971    |
+| cosine_precision@1  | 0.68      |
+| cosine_precision@3  | 0.2705    |
+| cosine_precision@5  | 0.1697    |
+| cosine_precision@10 | 0.0897    |
+| cosine_recall@1     | 0.68      |
+| cosine_recall@3     | 0.8114    |
+| cosine_recall@5     | 0.8486    |
+| cosine_recall@10    | 0.8971    |
+| cosine_ndcg@10      | 0.7889    |
+| cosine_mrr@10       | 0.7542    |
+| **cosine_map@100**  | **0.758** |
 #### Information Retrieval
 * Dataset: `dim_128`
 | Metric              | Value      |
 |:--------------------|:-----------|
+| cosine_accuracy@1   | 0.6571     |
+| cosine_accuracy@3   | 0.79       |
+| cosine_accuracy@5   | 0.8286     |
+| cosine_accuracy@10  | 0.8857     |
+| cosine_precision@1  | 0.6571     |
+| cosine_precision@3  | 0.2633     |
+| cosine_precision@5  | 0.1657     |
+| cosine_precision@10 | 0.0886     |
+| cosine_recall@1     | 0.6571     |
+| cosine_recall@3     | 0.79       |
+| cosine_recall@5     | 0.8286     |
+| cosine_recall@10    | 0.8857     |
+| cosine_ndcg@10      | 0.7704     |
+| cosine_mrr@10       | 0.7336     |
+| **cosine_map@100**  | **0.7379** |
 #### Information Retrieval
 * Dataset: `dim_64`
 | Metric              | Value      |
 |:--------------------|:-----------|
+| cosine_accuracy@1   | 0.62       |
+| cosine_accuracy@3   | 0.77       |
+| cosine_accuracy@5   | 0.8029     |
+| cosine_accuracy@10  | 0.85       |
+| cosine_precision@1  | 0.62       |
+| cosine_precision@3  | 0.2567     |
+| cosine_precision@5  | 0.1606     |
+| cosine_precision@10 | 0.085      |
+| cosine_recall@1     | 0.62       |
+| cosine_recall@3     | 0.77       |
+| cosine_recall@5     | 0.8029     |
+| cosine_recall@10    | 0.85       |
+| cosine_ndcg@10      | 0.7378     |
+| cosine_mrr@10       | 0.7016     |
+| **cosine_map@100**  | **0.7074** |
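The `dim_768` through `dim_64` datasets correspond to scoring with progressively shorter embeddings. With Matryoshka-style training, the usual way to shorten an embedding is to keep its leading components and rescale the result to unit length before computing cosine similarity. The sketch below illustrates that step in plain Python on toy 8-dimensional vectors; the vectors and the helper names are hypothetical and are not model output.

```python
import math


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for 768-dimensional embeddings (hypothetical data).
query = [0.9, 0.1, 0.3, 0.0, 0.2, -0.1, 0.05, 0.4]
document = [0.8, 0.2, 0.25, 0.1, -0.3, 0.2, 0.0, 0.1]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(document, dim)
    print(dim, round(cosine_similarity(q, d), 4))
```

In practice the truncation is normally delegated to the library, for example through the `truncate_dim` argument of `SentenceTransformer`, instead of being done by hand.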
 <!--
 ## Bias, Risks and Limitations
   |         | positive                                                                           | anchor                                                                            |
   |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
   | type    | string                                                                             | string                                                                            |
+  | details | <ul><li>min: 2 tokens</li><li>mean: 46.27 tokens</li><li>max: 326 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 20.87 tokens</li><li>max: 51 tokens</li></ul> |
 * Samples:
+  | positive                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | anchor                                                                                                                                       |
+  |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------|
+  | <code>We utilize a full yield curve approach in the estimation of service and interest costs by applying the specific spot rates along the yield curve used in the determination of the benefit obligation to the relevant projected cash flows. This approach provides a more precise measurement of service and interest costs by improving the correlation between the projected cash flows to the corresponding spot rates along the yield curve. This approach does not affect the measurement of our pension and other post-retirement benefit liabilities but generally results in lower benefit expense in periods when the yield curve is upward sloping.</code> | <code>How does the use of a full yield curve approach in estimating pension costs affect the measurement of liabilities and expenses?</code> |
+  | <code>Ending | 8,134 | | 8,206 | | 16,340 | | 8,061 | | 8,016 | 16,077</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | <code>What was the ending store count for the Family Dollar segment after the fiscal year ended January 28, 2023?</code>                     |
+  | <code>The company's capital expenditures for 2024 are expected to be approximately $5.7 billion.</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | <code>How much does the company expect to spend on capital expenditures in 2024?</code>                                                      |
 * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
   ```json
   {
 ### Training Hyperparameters
 #### Non-Default Hyperparameters
+- `eval_strategy`: epoch
 - `per_device_train_batch_size`: 32
 - `per_device_eval_batch_size`: 16
 - `gradient_accumulation_steps`: 16
 - `learning_rate`: 2e-05
+- `num_train_epochs`: 4
 - `lr_scheduler_type`: cosine
 - `warmup_ratio`: 0.1
+- `bf16`: True
+- `tf32`: True
 - `load_best_model_at_end`: True
+- `optim`: adamw_torch_fused
 - `batch_sampler`: no_duplicates
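Two of these settings combine into the effective batch size per optimizer step. A small sketch of the arithmetic, assuming a single training device (the device count is not stated in the card):

```python
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_devices = 1  # assumption: the card does not state the device count

# Examples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# In-batch negatives are drawn within a single forward batch, so gradient
# accumulation is not expected to enlarge the pool of negatives per anchor.
in_batch_negatives_per_anchor = per_device_train_batch_size - 1

print(effective_batch_size)           # 512
print(in_batch_negatives_per_anchor)  # 31
```

The distinction matters for a ranking loss with in-batch negatives: raising `gradient_accumulation_steps` smooths the gradient estimate but, as far as the loss is concerned, each anchor is likely still contrasted against only the other items in its own batch of 32.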
 #### All Hyperparameters
 - `overwrite_output_dir`: False
 - `do_predict`: False
+- `eval_strategy`: epoch
 - `prediction_loss_only`: True
 - `per_device_train_batch_size`: 32
 - `per_device_eval_batch_size`: 16
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
+- `num_train_epochs`: 4
 - `max_steps`: -1
 - `lr_scheduler_type`: cosine
 - `lr_scheduler_kwargs`: {}
 - `save_safetensors`: True
 - `save_on_each_node`: False
 - `save_only_model`: False
+- `restore_callback_states_from_checkpoint`: False
 - `no_cuda`: False
 - `use_cpu`: False
 - `use_mps_device`: False
 - `data_seed`: None
 - `jit_mode_eval`: False
 - `use_ipex`: False
+- `bf16`: True
 - `fp16`: False
 - `fp16_opt_level`: O1
 - `half_precision_backend`: auto
 - `bf16_full_eval`: False
 - `fp16_full_eval`: False
+- `tf32`: True
 - `local_rank`: 0
 - `ddp_backend`: None
 - `tpu_num_cores`: None
 - `fsdp_min_num_params`: 0
 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
 - `fsdp_transformer_layer_cls_to_wrap`: None
+- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
 - `deepspeed`: None
 - `label_smoothing_factor`: 0.0
+- `optim`: adamw_torch_fused
 - `optim_args`: None
 - `adafactor`: False
 - `group_by_length`: False
 - `include_num_input_tokens_seen`: False
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
+- `batch_eval_metrics`: False
 - `batch_sampler`: no_duplicates
 - `multi_dataset_batch_sampler`: proportional
 ### Training Logs
 | Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
 |:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
+| 0.8122     | 10     | 1.5661        | -                      | -                      | -                      | -                     | -                      |
+| 0.9746     | 12     | -             | 0.7151                 | 0.7378                 | 0.7443                 | 0.6680                | 0.7546                 |
+| 1.6244     | 20     | 0.6602        | -                      | -                      | -                      | -                     | -                      |
+| 1.9492     | 24     | -             | 0.7326                 | 0.7533                 | 0.7564                 | 0.7037                | 0.7640                 |
+| 2.4365     | 30     | 0.4675        | -                      | -                      | -                      | -                     | -                      |
+| 2.9239     | 36     | -             | 0.7384                 | 0.7575                 | 0.7601                 | 0.7086                | 0.7643                 |
+| 3.2487     | 40     | 0.3891        | -                      | -                      | -                      | -                     | -                      |
+| **3.8985** | **48** | **-**         | **0.7379**             | **0.758**              | **0.7601**             | **0.7074**            | **0.7649**             |
 * The bold row denotes the saved checkpoint.
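The fractional epoch values in the log can be reproduced from the step counts. The sketch below assumes 197 batches per epoch, a figure that is not stated in the card and was chosen only because it matches the logged values when combined with `gradient_accumulation_steps` of 16.

```python
batches_per_epoch = 197  # assumption: inferred from the log, not stated in the card
gradient_accumulation_steps = 16

# Optimizer steps per epoch, kept fractional on purpose.
steps_per_epoch = batches_per_epoch / gradient_accumulation_steps

# (step, epoch) pairs copied from the training log above.
logged_epochs = {
    10: 0.8122,
    12: 0.9746,
    20: 1.6244,
    24: 1.9492,
    30: 2.4365,
    36: 2.9239,
    40: 3.2487,
    48: 3.8985,
}

derived_epochs = {step: step / steps_per_epoch for step in logged_epochs}

for step, logged in logged_epochs.items():
    print(f"step {step}: derived {derived_epochs[step]:.4f}, logged {logged:.4f}")
```

Under this assumption the final logged step, 48, falls just short of 4.0 epochs because 197 is not a multiple of 16, which would explain why the last row reads 3.8985 even though `num_train_epochs` is 4.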
 ### Framework Versions
+- Python: 3.10.12
 - Sentence Transformers: 3.0.1
+- Transformers: 4.41.2
+- PyTorch: 2.1.2+cu121
 - Accelerate: 0.32.1
+- Datasets: 2.19.1
 - Tokenizers: 0.19.1
 ## Citation

config.json CHANGED

@@ -25,7 +25,7 @@
   "pad_token_id": 0,
   "position_embedding_type": "absolute",
   "torch_dtype": "float32",
-  "transformers_version": "4.40.2",
   "type_vocab_size": 2,
   "use_cache": true,
   "vocab_size": 30522

   "pad_token_id": 0,
   "position_embedding_type": "absolute",
   "torch_dtype": "float32",
+  "transformers_version": "4.41.2",
   "type_vocab_size": 2,
   "use_cache": true,
   "vocab_size": 30522

config_sentence_transformers.json CHANGED

@@ -1,8 +1,8 @@
 {
   "__version__": {
     "sentence_transformers": "3.0.1",
-    "transformers": "4.40.2",
-    "pytorch": "2.3.1"
   },
   "prompts": {},
   "default_prompt_name": null,

 {
   "__version__": {
     "sentence_transformers": "3.0.1",
+    "transformers": "4.41.2",
+    "pytorch": "2.1.2+cu121"
   },
   "prompts": {},
   "default_prompt_name": null,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:dca649d3679e0f7725f2ce71425b5e4713878f8a7e96ba37c3290b3560c1ce62
 size 437951328

 version https://git-lfs.github.com/spec/v1
+oid sha256:3a4b3599c8539611169e8b7acd33e1e01633e1ff2df151e68a9df530a8550c09
 size 437951328