SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2 on the allstats-semantic-search-synthetic-dataset-v1 dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: allstats-semantic-search-synthetic-dataset-v1

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: sentence-transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
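
The Pooling module above averages the transformer's token embeddings (mean pooling) to produce one 768-dimensional vector per input. As a rough sketch of what that configuration computes, not the library's actual internals, mask-aware mean pooling in plain PyTorch looks like this:

import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 768) output of XLMRobertaModel
    # attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    mask = attention_mask.unsqueeze(-1).float()
    summed = (token_embeddings * mask).sum(dim=1)  # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of real tokens per input
    return summed / counts                         # (batch, 768) sentence embeddings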

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yahyaabd/allstats-semantic-search-model-v1-3")
# Run inference
sentences = [
    'perubahan nilai tukar petani bulan mei 2017',
    'Perkembangan Nilai Tukar Petani Mei 2017',
    'Statistik Restoran/Rumah Makan Tahun 2014',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
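
For retrieval over a document collection, encode the corpus once and score incoming queries against it. Below is a minimal sketch using the library's util.semantic_search helper; the three-title corpus is illustrative only:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("yahyaabd/allstats-semantic-search-model-v1-3")

# Illustrative corpus; in practice this would be the full publication catalog.
corpus = [
    'Perkembangan Nilai Tukar Petani Mei 2017',
    'Statistik Restoran/Rumah Makan Tahun 2014',
    'Statistik Industri Besar dan Sedang Indonesia 2008',
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode('perubahan nilai tukar petani bulan mei 2017', convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit['corpus_id']], round(hit['score'], 4))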

Evaluation

Metrics

Semantic Similarity

Metric            allstats-semantic-search-v1-3-dev   allstat-semantic-search-v1-3-test
pearson_cosine    0.9956                              0.9955
spearman_cosine   0.9588                              0.9583
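
Both metrics correlate the model's cosine similarities with the gold labels (Pearson on raw values, Spearman on ranks), likely produced with the library's EmbeddingSimilarityEvaluator. An equivalent by-hand check with scipy, assuming dev_queries, dev_docs, and dev_labels hold the evaluation pairs, would be:

from scipy.stats import pearsonr, spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yahyaabd/allstats-semantic-search-model-v1-3")

# dev_queries, dev_docs: lists of strings; dev_labels: gold scores (assumed available).
cos_scores = model.similarity_pairwise(
    model.encode(dev_queries), model.encode(dev_docs)
).cpu().numpy()

print('pearson_cosine: ', pearsonr(cos_scores, dev_labels)[0])
print('spearman_cosine:', spearmanr(cos_scores, dev_labels)[0])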

Training Details

Training Dataset

allstats-semantic-search-synthetic-dataset-v1

  • Dataset: allstats-semantic-search-synthetic-dataset-v1 at b13c0a7
  • Size: 212,940 training samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:

    Column  Type    Min       Mean          Max
    query   string  5 tokens  11.46 tokens  34 tokens
    doc     string  5 tokens  14.47 tokens  54 tokens
    label   float   0.0       0.5           1.05

  • Samples:

    query: aDta industri besar dan sedang Indonesia 2008
    doc:   Statistik Industri Besar dan Sedang Indonesia 2008
    label: 0.9

    query: profil bisnis konstruksi individu jawa barat 2022
    doc:   Statistik Industri Manufaktur Indonesia 2015 - Bahan Baku
    label: 0.15

    query: data statistik ekonomi indonesia
    doc:   Nilai Tukar Valuta Asing di Indonesia 2014
    label: 0.08
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
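
In this configuration, CosineSimilarityLoss computes the cosine similarity of each (query, doc) embedding pair and regresses it onto the float label with MSE. Conceptually (a sketch of the objective, not the library's implementation):

import torch
import torch.nn.functional as F

def cosine_similarity_mse(query_emb: torch.Tensor, doc_emb: torch.Tensor,
                          labels: torch.Tensor) -> torch.Tensor:
    # query_emb, doc_emb: (batch, 768) sentence embeddings
    # labels: (batch,) target similarities, roughly in [0, 1]
    cos_sim = F.cosine_similarity(query_emb, doc_emb, dim=1)
    return F.mse_loss(cos_sim, labels)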
    

Evaluation Dataset

allstats-semantic-search-synthetic-dataset-v1

  • Dataset: allstats-semantic-search-synthetic-dataset-v1 at b13c0a7
  • Size: 26,618 evaluation samples
  • Columns: query, doc, and label
  • Approximate statistics based on the first 1000 samples:

    Column  Type    Min       Mean          Max
    query   string  5 tokens  11.38 tokens  34 tokens
    doc     string  4 tokens  14.63 tokens  55 tokens
    label   float   0.0       0.51          1.0

  • Samples:

    query: tahun berapa ekspor naik 2,37% dan impor naik 30,30%?
    doc:   Bulan November 2006 Ekspor Naik 2,37 % dan Impor Naik 30,30 %
    label: 1.0

    query: Berapa produksi padi pada tahun 2023?
    doc:   Produksi padi tahun lainnya
    label: 0.0

    query: data statistik solus per aqua 2015
    doc:   Statistik Solus Per Aqua (SPA) 2015
    label: 0.97
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 12
  • warmup_ratio: 0.1
  • fp16: True
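
With these settings, the run can be reconstructed roughly as follows using the Sentence Transformers v3 trainer API. This is a sketch: the dataset id and split names are assumed from the links above, and the output path is illustrative.

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
dataset = load_dataset("yahyaabd/allstats-semantic-search-synthetic-dataset-v1")  # id assumed

args = SentenceTransformerTrainingArguments(
    output_dir="allstats-semantic-search-model-v1-3",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=12,
    warmup_ratio=0.1,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["dev"],  # split name assumed
    loss=CosineSimilarityLoss(model),  # loss_fct defaults to MSELoss
)
trainer.train()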

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 12
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch    Step    Training Loss    Validation Loss    dev spearman_cosine    test spearman_cosine
(dev = allstats-semantic-search-v1-3-dev; test = allstat-semantic-search-v1-3-test)
0.0751 500 0.0653 0.0400 0.7035 -
0.1503 1000 0.0361 0.0296 0.7310 -
0.2254 1500 0.0278 0.0226 0.7669 -
0.3005 2000 0.0226 0.0195 0.7748 -
0.3757 2500 0.0208 0.0183 0.7769 -
0.4508 3000 0.0184 0.0172 0.7994 -
0.5259 3500 0.0179 0.0159 0.7931 -
0.6011 4000 0.0159 0.0155 0.7966 -
0.6762 4500 0.0161 0.0150 0.8047 -
0.7513 5000 0.0163 0.0153 0.7910 -
0.8264 5500 0.0158 0.0155 0.7956 -
0.9016 6000 0.0149 0.0141 0.8148 -
0.9767 6500 0.0149 0.0145 0.8287 -
1.0518 7000 0.0148 0.0150 0.7933 -
1.1270 7500 0.0131 0.0136 0.8083 -
1.2021 8000 0.0124 0.0131 0.8173 -
1.2772 8500 0.0133 0.0130 0.8117 -
1.3524 9000 0.012 0.0126 0.8259 -
1.4275 9500 0.0119 0.0120 0.8178 -
1.5026 10000 0.0116 0.0118 0.8332 -
1.5778 10500 0.0132 0.0123 0.8108 -
1.6529 11000 0.0114 0.0111 0.8365 -
1.7280 11500 0.0105 0.0109 0.8235 -
1.8032 12000 0.0107 0.0105 0.8445 -
1.8783 12500 0.0106 0.0101 0.8330 -
1.9534 13000 0.0095 0.0096 0.8437 -
2.0285 13500 0.0093 0.0094 0.8417 -
2.1037 14000 0.0079 0.0093 0.8485 -
2.1788 14500 0.008 0.0089 0.8422 -
2.2539 15000 0.0081 0.0086 0.8485 -
2.3291 15500 0.008 0.0084 0.8530 -
2.4042 16000 0.007 0.0084 0.8597 -
2.4793 16500 0.0081 0.0087 0.8499 -
2.5545 17000 0.0078 0.0078 0.8577 -
2.6296 17500 0.007 0.0080 0.8559 -
2.7047 18000 0.0072 0.0078 0.8569 -
2.7799 18500 0.0069 0.0079 0.8579 -
2.8550 19000 0.0064 0.0072 0.8693 -
2.9301 19500 0.0064 0.0070 0.8747 -
3.0053 20000 0.0061 0.0068 0.8757 -
3.0804 20500 0.0052 0.0069 0.8727 -
3.1555 21000 0.005 0.0067 0.8734 -
3.2307 21500 0.0054 0.0065 0.8727 -
3.3058 22000 0.0058 0.0070 0.8715 -
3.3809 22500 0.0056 0.0066 0.8724 -
3.4560 23000 0.0056 0.0070 0.8740 -
3.5312 23500 0.0054 0.0060 0.8775 -
3.6063 24000 0.0051 0.0062 0.8746 -
3.6814 24500 0.0047 0.0060 0.8765 -
3.7566 25000 0.0051 0.0067 0.8783 -
3.8317 25500 0.0048 0.0058 0.8824 -
3.9068 26000 0.0048 0.0059 0.8862 -
3.9820 26500 0.005 0.0056 0.8853 -
4.0571 27000 0.0042 0.0053 0.8868 -
4.1322 27500 0.0036 0.0056 0.8893 -
4.2074 28000 0.0041 0.0052 0.8954 -
4.2825 28500 0.0041 0.0050 0.8943 -
4.3576 29000 0.0036 0.0050 0.8890 -
4.4328 29500 0.0036 0.0046 0.8990 -
4.5079 30000 0.0038 0.0051 0.8934 -
4.5830 30500 0.0037 0.0049 0.9011 -
4.6582 31000 0.0036 0.0049 0.9000 -
4.7333 31500 0.0041 0.0052 0.8938 -
4.8084 32000 0.004 0.0049 0.8971 -
4.8835 32500 0.0038 0.0043 0.9023 -
4.9587 33000 0.0036 0.0044 0.9006 -
5.0338 33500 0.0032 0.0043 0.9042 -
5.1089 34000 0.0031 0.0042 0.9054 -
5.1841 34500 0.0028 0.0042 0.9052 -
5.2592 35000 0.0028 0.0043 0.9065 -
5.3343 35500 0.003 0.0041 0.9093 -
5.4095 36000 0.0029 0.0042 0.9084 -
5.4846 36500 0.0029 0.0044 0.9078 -
5.5597 37000 0.0027 0.0043 0.9062 -
5.6349 37500 0.003 0.0039 0.9101 -
5.7100 38000 0.0027 0.0041 0.9092 -
5.7851 38500 0.0025 0.0039 0.9140 -
5.8603 39000 0.0027 0.0037 0.9138 -
5.9354 39500 0.0027 0.0037 0.9137 -
6.0105 40000 0.0027 0.0036 0.9162 -
6.0856 40500 0.002 0.0035 0.9209 -
6.1608 41000 0.0021 0.0037 0.9180 -
6.2359 41500 0.0023 0.0036 0.9183 -
6.3110 42000 0.0024 0.0035 0.9218 -
6.3862 42500 0.002 0.0033 0.9216 -
6.4613 43000 0.0024 0.0035 0.9220 -
6.5364 43500 0.0018 0.0034 0.9232 -
6.6116 44000 0.0021 0.0033 0.9236 -
6.6867 44500 0.0021 0.0035 0.9225 -
6.7618 45000 0.0027 0.0031 0.9227 -
6.8370 45500 0.0019 0.0032 0.9242 -
6.9121 46000 0.0022 0.0033 0.9224 -
6.9872 46500 0.0022 0.0030 0.9252 -
7.0624 47000 0.0017 0.0029 0.9294 -
7.1375 47500 0.0014 0.0028 0.9304 -
7.2126 48000 0.0015 0.0028 0.9324 -
7.2878 48500 0.0014 0.0030 0.9313 -
7.3629 49000 0.0015 0.0029 0.9333 -
7.4380 49500 0.0015 0.0028 0.9342 -
7.5131 50000 0.0018 0.0030 0.9261 -
7.5883 50500 0.0016 0.0030 0.9329 -
7.6634 51000 0.0019 0.0026 0.9334 -
7.7385 51500 0.0018 0.0029 0.9336 -
7.8137 52000 0.0016 0.0026 0.9353 -
7.8888 52500 0.0016 0.0026 0.9351 -
7.9639 53000 0.0017 0.0024 0.9356 -
8.0391 53500 0.0013 0.0023 0.9396 -
8.1142 54000 0.0012 0.0024 0.9390 -
8.1893 54500 0.001 0.0024 0.9421 -
8.2645 55000 0.0012 0.0024 0.9406 -
8.3396 55500 0.0012 0.0023 0.9407 -
8.4147 56000 0.0012 0.0024 0.9398 -
8.4899 56500 0.0012 0.0024 0.9412 -
8.5650 57000 0.0014 0.0024 0.9397 -
8.6401 57500 0.0013 0.0023 0.9411 -
8.7153 58000 0.0013 0.0023 0.9418 -
8.7904 58500 0.0014 0.0022 0.9432 -
8.8655 59000 0.0011 0.0022 0.9448 -
8.9406 59500 0.0012 0.0022 0.9455 -
9.0158 60000 0.0012 0.0021 0.9453 -
9.0909 60500 0.0009 0.0021 0.9461 -
9.1660 61000 0.0009 0.0021 0.9465 -
9.2412 61500 0.0009 0.0021 0.9471 -
9.3163 62000 0.0009 0.0021 0.9477 -
9.3914 62500 0.0008 0.0020 0.9482 -
9.4666 63000 0.0012 0.0020 0.9478 -
9.5417 63500 0.0009 0.0020 0.9479 -
9.6168 64000 0.0009 0.0020 0.9485 -
9.6920 64500 0.0011 0.0020 0.9492 -
9.7671 65000 0.0008 0.0019 0.9497 -
9.8422 65500 0.001 0.0019 0.9504 -
9.9174 66000 0.0009 0.0019 0.9518 -
9.9925 66500 0.0009 0.0019 0.9510 -
10.0676 67000 0.0008 0.0018 0.9517 -
10.1427 67500 0.0007 0.0018 0.9524 -
10.2179 68000 0.0007 0.0018 0.9521 -
10.2930 68500 0.0008 0.0019 0.9526 -
10.3681 69000 0.0007 0.0019 0.9529 -
10.4433 69500 0.0008 0.0018 0.9541 -
10.5184 70000 0.0007 0.0017 0.9551 -
10.5935 70500 0.0007 0.0018 0.9550 -
10.6687 71000 0.0008 0.0017 0.9554 -
10.7438 71500 0.0007 0.0017 0.9558 -
10.8189 72000 0.0007 0.0018 0.9558 -
10.8941 72500 0.0007 0.0018 0.9562 -
10.9692 73000 0.0009 0.0017 0.9559 -
11.0443 73500 0.0005 0.0017 0.9571 -
11.1195 74000 0.0006 0.0017 0.9570 -
11.1946 74500 0.0005 0.0017 0.9573 -
11.2697 75000 0.0005 0.0017 0.9574 -
11.3449 75500 0.0006 0.0017 0.9576 -
11.4200 76000 0.0006 0.0017 0.9577 -
11.4951 76500 0.0006 0.0017 0.9577 -
11.5702 77000 0.0005 0.0016 0.9582 -
11.6454 77500 0.0006 0.0017 0.9583 -
11.7205 78000 0.0005 0.0016 0.9584 -
11.7956 78500 0.0005 0.0016 0.9587 -
11.8708 79000 0.0005 0.0016 0.9588 -
11.9459 79500 0.0005 0.0016 0.9588 -
12.0 79860 - - - 0.9583

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.2.2+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
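
To reproduce this environment, the reported versions can be pinned at install time (a convenience snippet, not from the original card; PyTorch 2.2.2+cu121 additionally needs the matching CUDA wheel from the PyTorch index):

pip install sentence-transformers==3.3.1 transformers==4.47.1 accelerate==1.2.1 datasets==3.2.0 tokenizers==0.21.0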

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
