SentenceTransformer based on x2bee/ModernBERT-SimCSE-multitask_v03

This is a sentence-transformers model finetuned from x2bee/ModernBERT-SimCSE-multitask_v03 on the misc_sts_pairs_v2_kor_kosimcse dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: x2bee/ModernBERT-SimCSE-multitask_v03
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 184M parameters (F32 safetensors)
  • Training Dataset: misc_sts_pairs_v2_kor_kosimcse
  • Language: Korean

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
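
After loading, this stack can be confirmed programmatically. A minimal sketch (the module order and the 768-dimensional output follow from the architecture above):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("x2bee/ModernBERT-SimCSE-multitask_v03-beta")

# SentenceTransformer behaves like an nn.Sequential over its modules:
# (0) Transformer, (1) mean Pooling, (2) Dense 768 -> 768 with Tanh.
for idx, module in enumerate(model):
    print(idx, module)

print(model.get_sentence_embedding_dimension())  # 768
print(model.max_seq_length)                      # 512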

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("x2bee/ModernBERT-SimCSE-multitask_v03-beta")
# Run inference
sentences = [
    '버스가 바쁜 길을 따라 운전한다.',    # "A bus drives along a busy road."
    '녹색 버스가 도로를 따라 내려간다.',  # "A green bus goes down the road."
    '그 여자는 데이트하러 가는 중이다.',  # "The woman is on her way to a date."
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
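
Building on the snippet above, the similarity matrix can be used directly for small-scale semantic search. A minimal sketch that treats the first sentence as the query and ranks the others by cosine similarity:

import torch

# Row 0 holds the similarities between the first sentence and all three.
query_scores = similarities[0]
ranking = torch.argsort(query_scores, descending=True)
for idx in ranking.tolist():
    print(f"{query_scores[idx]:.4f}  {sentences[idx]}")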

Evaluation

Metrics

Semantic Similarity

  • Dataset: sts_dev

Metric               Value
pearson_cosine       0.8352
spearman_cosine      0.8406
pearson_euclidean    0.8257
spearman_euclidean   0.8336
pearson_manhattan    0.8261
spearman_manhattan   0.8341
pearson_dot          0.7368
spearman_dot         0.7201
pearson_max          0.8352
spearman_max         0.8406
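
These figures come from an STS-style evaluation. A sketch of how to reproduce them with EmbeddingSimilarityEvaluator, assuming held-out pairs with gold scores in [0, 1] (the pairs below are placeholders; the actual dev set is the 1,500-pair evaluation dataset described under Training Details):

from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("x2bee/ModernBERT-SimCSE-multitask_v03-beta")

# Placeholder pairs; substitute the real dev set.
sentences1 = ["어린아이가 말을 타고 있다."]
sentences2 = ["아이가 말을 타고 있다."]
gold_scores = [0.95]

dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1,
    sentences2,
    gold_scores,
    main_similarity=SimilarityFunction.COSINE,
    name="sts_dev",
)
print(dev_evaluator(model))  # dict of pearson/spearman metrics, keyed by name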

Training Details

Training Dataset

misc_sts_pairs_v2_kor_kosimcse

  • Dataset: misc_sts_pairs_v2_kor_kosimcse at e747415
  • Size: 449,904 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min 6 tokens, mean 18.3 tokens, max 69 tokens
    • sentence2: string; min 6 tokens, mean 18.69 tokens, max 66 tokens
    • score: float; min 0.11, mean 0.77, max 1.0
  • Samples:
    • sentence1: 주홍글씨는 언제 출판되었습니까? ("When was The Scarlet Letter published?")
      sentence2: 《주홍글씨》는 몇 년에 출판되었습니까? ("In what year was The Scarlet Letter published?")
      score: 0.8638778924942017
    • sentence1: 폴란드에서 빨간색과 흰색은 무엇을 의미합니까? ("What do red and white signify in Poland?")
      sentence2: 폴란드 국기의 색상은 무엇입니까? ("What are the colors of the Polish flag?")
      score: 0.6773715019226074
    • sentence1: 노르만인들은 방어를 위해 모트와 베일리 성을 어떻게 사용했는가? ("How did the Normans use motte-and-bailey castles for defense?")
      sentence2: 11세기에는 어떻게 모트와 베일리 성을 만들었습니까? ("How were motte-and-bailey castles built in the 11th century?")
      score: 0.7460665702819824
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
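
A minimal training sketch showing how this dataset and loss fit together (the Hub dataset ID x2bee/misc_sts_pairs_v2_kor_kosimcse is an assumption based on the dataset name above):

import torch
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("x2bee/ModernBERT-SimCSE-multitask_v03")

# Assumed Hub ID; the columns are sentence1, sentence2, and score as listed above.
train_dataset = load_dataset("x2bee/misc_sts_pairs_v2_kor_kosimcse", split="train")

# CosineSimilarityLoss regresses cosine(emb1, emb2) onto the gold score;
# MSELoss is the loss_fct shown in the parameters above.
loss = CosineSimilarityLoss(model, loss_fct=torch.nn.MSELoss())

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()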
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,500 evaluation samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min 7 tokens, mean 20.38 tokens, max 52 tokens
    • sentence2: string; min 6 tokens, mean 20.52 tokens, max 54 tokens
    • score: float; min 0.0, mean 0.42, max 1.0
  • Samples:
    • sentence1: 안전모를 가진 한 남자가 춤을 추고 있다. ("A man with a hard hat is dancing.")
      sentence2: 안전모를 쓴 한 남자가 춤을 추고 있다. ("A man wearing a hard hat is dancing.")
      score: 1.0
    • sentence1: 어린아이가 말을 타고 있다. ("A young child is riding a horse.")
      sentence2: 아이가 말을 타고 있다. ("A child is riding a horse.")
      score: 0.95
    • sentence1: 한 남자가 뱀에게 쥐를 먹이고 있다. ("A man is feeding a mouse to a snake.")
      sentence2: 남자가 뱀에게 쥐를 먹이고 있다. ("The man is feeding a mouse to a snake.")
      score: 1.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • overwrite_output_dir: True
  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 8
  • learning_rate: 8e-05
  • num_train_epochs: 2.0
  • warmup_ratio: 0.2
  • push_to_hub: True
  • hub_model_id: x2bee/ModernBERT-SimCSE-multitask_v03-beta
  • hub_strategy: checkpoint
  • batch_sampler: no_duplicates
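
A sketch of how these values map onto SentenceTransformerTrainingArguments (output_dir is an assumption; everything not listed keeps its default):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # assumed; the card does not state the output path
    overwrite_output_dir=True,
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,
    learning_rate=8e-5,
    num_train_epochs=2.0,
    warmup_ratio=0.2,
    push_to_hub=True,
    hub_model_id="x2bee/ModernBERT-SimCSE-multitask_v03-beta",
    hub_strategy="checkpoint",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)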

All Hyperparameters

  • overwrite_output_dir: True
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 8e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2.0
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: x2bee/ModernBERT-SimCSE-multitask_v03-beta
  • hub_strategy: checkpoint
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts_dev_spearman_max
0.0028 10 0.0216 - -
0.0057 20 0.0204 - -
0.0085 30 0.0194 - -
0.0114 40 0.0195 - -
0.0142 50 0.0182 - -
0.0171 60 0.0161 - -
0.0199 70 0.015 - -
0.0228 80 0.0153 - -
0.0256 90 0.0137 - -
0.0285 100 0.014 - -
0.0313 110 0.0122 - -
0.0341 120 0.0114 - -
0.0370 130 0.0109 - -
0.0398 140 0.0097 - -
0.0427 150 0.0085 - -
0.0455 160 0.0084 - -
0.0484 170 0.0083 - -
0.0512 180 0.0078 - -
0.0541 190 0.008 - -
0.0569 200 0.0073 - -
0.0597 210 0.0079 - -
0.0626 220 0.0073 - -
0.0654 230 0.0079 - -
0.0683 240 0.0068 - -
0.0711 250 0.0068 0.0333 0.8229
0.0740 260 0.0073 - -
0.0768 270 0.0077 - -
0.0797 280 0.0067 - -
0.0825 290 0.007 - -
0.0854 300 0.0065 - -
0.0882 310 0.0072 - -
0.0910 320 0.0068 - -
0.0939 330 0.0064 - -
0.0967 340 0.0074 - -
0.0996 350 0.0071 - -
0.1024 360 0.0065 - -
0.1053 370 0.0067 - -
0.1081 380 0.0063 - -
0.1110 390 0.0062 - -
0.1138 400 0.0068 - -
0.1166 410 0.0064 - -
0.1195 420 0.0064 - -
0.1223 430 0.0064 - -
0.1252 440 0.0074 - -
0.1280 450 0.0069 - -
0.1309 460 0.0065 - -
0.1337 470 0.0067 - -
0.1366 480 0.0068 - -
0.1394 490 0.0057 - -
0.1423 500 0.0065 0.0343 0.8284
0.1451 510 0.0069 - -
0.1479 520 0.0068 - -
0.1508 530 0.0065 - -
0.1536 540 0.0065 - -
0.1565 550 0.0063 - -
0.1593 560 0.0058 - -
0.1622 570 0.0064 - -
0.1650 580 0.0062 - -
0.1679 590 0.0061 - -
0.1707 600 0.0062 - -
0.1735 610 0.0057 - -
0.1764 620 0.0066 - -
0.1792 630 0.0061 - -
0.1821 640 0.0054 - -
0.1849 650 0.0066 - -
0.1878 660 0.0059 - -
0.1906 670 0.0063 - -
0.1935 680 0.0065 - -
0.1963 690 0.0065 - -
0.1992 700 0.0058 - -
0.2020 710 0.006 - -
0.2048 720 0.0062 - -
0.2077 730 0.0058 - -
0.2105 740 0.0058 - -
0.2134 750 0.0056 0.0356 0.8302
0.2162 760 0.0067 - -
0.2191 770 0.0063 - -
0.2219 780 0.0063 - -
0.2248 790 0.0063 - -
0.2276 800 0.0056 - -
0.2304 810 0.0058 - -
0.2333 820 0.0053 - -
0.2361 830 0.0057 - -
0.2390 840 0.0055 - -
0.2418 850 0.0054 - -
0.2447 860 0.0065 - -
0.2475 870 0.0054 - -
0.2504 880 0.0051 - -
0.2532 890 0.0057 - -
0.2561 900 0.0056 - -
0.2589 910 0.0055 - -
0.2617 920 0.0051 - -
0.2646 930 0.0055 - -
0.2674 940 0.0059 - -
0.2703 950 0.005 - -
0.2731 960 0.0058 - -
0.2760 970 0.005 - -
0.2788 980 0.0055 - -
0.2817 990 0.0054 - -
0.2845 1000 0.0055 0.0360 0.8319
0.2874 1010 0.0059 - -
0.2902 1020 0.0049 - -
0.2930 1030 0.0052 - -
0.2959 1040 0.0051 - -
0.2987 1050 0.006 - -
0.3016 1060 0.0048 - -
0.3044 1070 0.0055 - -
0.3073 1080 0.0052 - -
0.3101 1090 0.0051 - -
0.3130 1100 0.0051 - -
0.3158 1110 0.005 - -
0.3186 1120 0.0054 - -
0.3215 1130 0.0051 - -
0.3243 1140 0.0054 - -
0.3272 1150 0.0056 - -
0.3300 1160 0.0053 - -
0.3329 1170 0.0052 - -
0.3357 1180 0.0051 - -
0.3386 1190 0.0051 - -
0.3414 1200 0.0048 - -
0.3443 1210 0.005 - -
0.3471 1220 0.0055 - -
0.3499 1230 0.0049 - -
0.3528 1240 0.0053 - -
0.3556 1250 0.0052 0.0364 0.8330
0.3585 1260 0.0051 - -
0.3613 1270 0.005 - -
0.3642 1280 0.005 - -
0.3670 1290 0.0045 - -
0.3699 1300 0.0055 - -
0.3727 1310 0.0049 - -
0.3755 1320 0.0049 - -
0.3784 1330 0.0053 - -
0.3812 1340 0.005 - -
0.3841 1350 0.0048 - -
0.3869 1360 0.0049 - -
0.3898 1370 0.0046 - -
0.3926 1380 0.0049 - -
0.3955 1390 0.0052 - -
0.3983 1400 0.005 - -
0.4012 1410 0.0052 - -
0.4040 1420 0.0052 - -
0.4068 1430 0.0045 - -
0.4097 1440 0.0046 - -
0.4125 1450 0.0056 - -
0.4154 1460 0.0056 - -
0.4182 1470 0.005 - -
0.4211 1480 0.0051 - -
0.4239 1490 0.0049 - -
0.4268 1500 0.0048 0.0374 0.8334
0.4296 1510 0.0053 - -
0.4324 1520 0.0054 - -
0.4353 1530 0.0048 - -
0.4381 1540 0.005 - -
0.4410 1550 0.0045 - -
0.4438 1560 0.0046 - -
0.4467 1570 0.0045 - -
0.4495 1580 0.0049 - -
0.4524 1590 0.0048 - -
0.4552 1600 0.005 - -
0.4581 1610 0.0045 - -
0.4609 1620 0.0049 - -
0.4637 1630 0.0044 - -
0.4666 1640 0.0048 - -
0.4694 1650 0.0049 - -
0.4723 1660 0.0048 - -
0.4751 1670 0.0051 - -
0.4780 1680 0.0047 - -
0.4808 1690 0.0048 - -
0.4837 1700 0.0047 - -
0.4865 1710 0.0044 - -
0.4893 1720 0.0049 - -
0.4922 1730 0.0049 - -
0.4950 1740 0.0051 - -
0.4979 1750 0.0043 0.0392 0.8352
0.5007 1760 0.0043 - -
0.5036 1770 0.0045 - -
0.5064 1780 0.0046 - -
0.5093 1790 0.0042 - -
0.5121 1800 0.0047 - -
0.5150 1810 0.0047 - -
0.5178 1820 0.0046 - -
0.5206 1830 0.0044 - -
0.5235 1840 0.0046 - -
0.5263 1850 0.0047 - -
0.5292 1860 0.0044 - -
0.5320 1870 0.0047 - -
0.5349 1880 0.0049 - -
0.5377 1890 0.0049 - -
0.5406 1900 0.0047 - -
0.5434 1910 0.0045 - -
0.5462 1920 0.0044 - -
0.5491 1930 0.0048 - -
0.5519 1940 0.0041 - -
0.5548 1950 0.004 - -
0.5576 1960 0.0048 - -
0.5605 1970 0.0042 - -
0.5633 1980 0.0048 - -
0.5662 1990 0.0045 - -
0.5690 2000 0.0043 0.0375 0.8359
0.5719 2010 0.005 - -
0.5747 2020 0.0049 - -
0.5775 2030 0.0044 - -
0.5804 2040 0.0045 - -
0.5832 2050 0.0043 - -
0.5861 2060 0.0045 - -
0.5889 2070 0.004 - -
0.5918 2080 0.0042 - -
0.5946 2090 0.0044 - -
0.5975 2100 0.0043 - -
0.6003 2110 0.0041 - -
0.6032 2120 0.0046 - -
0.6060 2130 0.0048 - -
0.6088 2140 0.0048 - -
0.6117 2150 0.0041 - -
0.6145 2160 0.0044 - -
0.6174 2170 0.0045 - -
0.6202 2180 0.0044 - -
0.6231 2190 0.0044 - -
0.6259 2200 0.0046 - -
0.6288 2210 0.0048 - -
0.6316 2220 0.0045 - -
0.6344 2230 0.004 - -
0.6373 2240 0.0041 - -
0.6401 2250 0.0044 0.0391 0.8369
0.6430 2260 0.0044 - -
0.6458 2270 0.0045 - -
0.6487 2280 0.0041 - -
0.6515 2290 0.0042 - -
0.6544 2300 0.0043 - -
0.6572 2310 0.004 - -
0.6601 2320 0.0042 - -
0.6629 2330 0.0041 - -
0.6657 2340 0.0045 - -
0.6686 2350 0.0045 - -
0.6714 2360 0.0042 - -
0.6743 2370 0.0045 - -
0.6771 2380 0.0044 - -
0.6800 2390 0.0044 - -
0.6828 2400 0.0041 - -
0.6857 2410 0.0045 - -
0.6885 2420 0.0046 - -
0.6913 2430 0.0041 - -
0.6942 2440 0.0048 - -
0.6970 2450 0.0041 - -
0.6999 2460 0.0043 - -
0.7027 2470 0.0043 - -
0.7056 2480 0.0037 - -
0.7084 2490 0.0042 - -
0.7113 2500 0.0043 0.0405 0.8365
0.7141 2510 0.0045 - -
0.7170 2520 0.0044 - -
0.7198 2530 0.0042 - -
0.7226 2540 0.0042 - -
0.7255 2550 0.0041 - -
0.7283 2560 0.0042 - -
0.7312 2570 0.0041 - -
0.7340 2580 0.0042 - -
0.7369 2590 0.0041 - -
0.7397 2600 0.0047 - -
0.7426 2610 0.0038 - -
0.7454 2620 0.0041 - -
0.7482 2630 0.0042 - -
0.7511 2640 0.0042 - -
0.7539 2650 0.0042 - -
0.7568 2660 0.0041 - -
0.7596 2670 0.0042 - -
0.7625 2680 0.0044 - -
0.7653 2690 0.0039 - -
0.7682 2700 0.0037 - -
0.7710 2710 0.0044 - -
0.7739 2720 0.0043 - -
0.7767 2730 0.0042 - -
0.7795 2740 0.0041 - -
0.7824 2750 0.0039 0.0387 0.8376
0.7852 2760 0.0047 - -
0.7881 2770 0.004 - -
0.7909 2780 0.0039 - -
0.7938 2790 0.0039 - -
0.7966 2800 0.0039 - -
0.7995 2810 0.0039 - -
0.8023 2820 0.0039 - -
0.8051 2830 0.0041 - -
0.8080 2840 0.0037 - -
0.8108 2850 0.0044 - -
0.8137 2860 0.0043 - -
0.8165 2870 0.0041 - -
0.8194 2880 0.0043 - -
0.8222 2890 0.0039 - -
0.8251 2900 0.0041 - -
0.8279 2910 0.0044 - -
0.8308 2920 0.004 - -
0.8336 2930 0.0042 - -
0.8364 2940 0.0039 - -
0.8393 2950 0.004 - -
0.8421 2960 0.0042 - -
0.8450 2970 0.004 - -
0.8478 2980 0.0039 - -
0.8507 2990 0.0037 - -
0.8535 3000 0.0039 0.0386 0.8386
0.8564 3010 0.0041 - -
0.8592 3020 0.0043 - -
0.8621 3030 0.0041 - -
0.8649 3040 0.0041 - -
0.8677 3050 0.0043 - -
0.8706 3060 0.0042 - -
0.8734 3070 0.0039 - -
0.8763 3080 0.004 - -
0.8791 3090 0.0039 - -
0.8820 3100 0.0039 - -
0.8848 3110 0.004 - -
0.8877 3120 0.0039 - -
0.8905 3130 0.0038 - -
0.8933 3140 0.0036 - -
0.8962 3150 0.0039 - -
0.8990 3160 0.0039 - -
0.9019 3170 0.0038 - -
0.9047 3180 0.0039 - -
0.9076 3190 0.0041 - -
0.9104 3200 0.004 - -
0.9133 3210 0.0041 - -
0.9161 3220 0.0042 - -
0.9190 3230 0.004 - -
0.9218 3240 0.0041 - -
0.9246 3250 0.0041 0.0420 0.8408
0.9275 3260 0.0041 - -
0.9303 3270 0.004 - -
0.9332 3280 0.0042 - -
0.9360 3290 0.004 - -
0.9389 3300 0.0037 - -
0.9417 3310 0.0038 - -
0.9446 3320 0.0039 - -
0.9474 3330 0.004 - -
0.9502 3340 0.0037 - -
0.9531 3350 0.0038 - -
0.9559 3360 0.0037 - -
0.9588 3370 0.0042 - -
0.9616 3380 0.0042 - -
0.9645 3390 0.0042 - -
0.9673 3400 0.0037 - -
0.9702 3410 0.0038 - -
0.9730 3420 0.0039 - -
0.9759 3430 0.0038 - -
0.9787 3440 0.0041 - -
0.9815 3450 0.004 - -
0.9844 3460 0.0039 - -
0.9872 3470 0.0036 - -
0.9901 3480 0.0037 - -
0.9929 3490 0.0039 - -
0.9958 3500 0.0036 0.0403 0.8396
0.9986 3510 0.0035 - -
1.0014 3520 0.0036 - -
1.0043 3530 0.0035 - -
1.0071 3540 0.0036 - -
1.0100 3550 0.0039 - -
1.0128 3560 0.0039 - -
1.0156 3570 0.004 - -
1.0185 3580 0.0035 - -
1.0213 3590 0.0036 - -
1.0242 3600 0.004 - -
1.0270 3610 0.0039 - -
1.0299 3620 0.0042 - -
1.0327 3630 0.0038 - -
1.0356 3640 0.004 - -
1.0384 3650 0.0038 - -
1.0413 3660 0.0039 - -
1.0441 3670 0.0037 - -
1.0469 3680 0.0039 - -
1.0498 3690 0.0037 - -
1.0526 3700 0.0038 - -
1.0555 3710 0.0036 - -
1.0583 3720 0.0035 - -
1.0612 3730 0.0038 - -
1.0640 3740 0.0032 - -
1.0669 3750 0.0038 0.0408 0.8405
1.0697 3760 0.0034 - -
1.0725 3770 0.0037 - -
1.0754 3780 0.0036 - -
1.0782 3790 0.0038 - -
1.0811 3800 0.0038 - -
1.0839 3810 0.0033 - -
1.0868 3820 0.0039 - -
1.0896 3830 0.0034 - -
1.0925 3840 0.0035 - -
1.0953 3850 0.0036 - -
1.0982 3860 0.004 - -
1.1010 3870 0.0038 - -
1.1038 3880 0.0032 - -
1.1067 3890 0.0036 - -
1.1095 3900 0.0033 - -
1.1124 3910 0.0038 - -
1.1152 3920 0.0034 - -
1.1181 3930 0.0034 - -
1.1209 3940 0.0031 - -
1.1238 3950 0.0041 - -
1.1266 3960 0.0038 - -
1.1294 3970 0.0033 - -
1.1323 3980 0.0037 - -
1.1351 3990 0.0035 - -
1.1380 4000 0.0034 0.0403 0.8428
1.1408 4010 0.0033 - -
1.1437 4020 0.0035 - -
1.1465 4030 0.0041 - -
1.1494 4040 0.0036 - -
1.1522 4050 0.0035 - -
1.1551 4060 0.0038 - -
1.1579 4070 0.0034 - -
1.1607 4080 0.003 - -
1.1636 4090 0.0038 - -
1.1664 4100 0.0035 - -
1.1693 4110 0.0036 - -
1.1721 4120 0.0036 - -
1.1750 4130 0.0035 - -
1.1778 4140 0.004 - -
1.1807 4150 0.003 - -
1.1835 4160 0.0036 - -
1.1864 4170 0.004 - -
1.1892 4180 0.0034 - -
1.1920 4190 0.0035 - -
1.1949 4200 0.004 - -
1.1977 4210 0.0037 - -
1.2006 4220 0.0037 - -
1.2034 4230 0.0032 - -
1.2063 4240 0.0035 - -
1.2091 4250 0.0035 0.0408 0.8411
1.2120 4260 0.0033 - -
1.2148 4270 0.0039 - -
1.2176 4280 0.0037 - -
1.2205 4290 0.0036 - -
1.2233 4300 0.0033 - -
1.2262 4310 0.0034 - -
1.2290 4320 0.0033 - -
1.2319 4330 0.0034 - -
1.2347 4340 0.0035 - -
1.2376 4350 0.0035 - -
1.2404 4360 0.003 - -
1.2433 4370 0.0037 - -
1.2461 4380 0.0035 - -
1.2489 4390 0.0033 - -
1.2518 4400 0.0033 - -
1.2546 4410 0.0033 - -
1.2575 4420 0.0034 - -
1.2603 4430 0.0032 - -
1.2632 4440 0.0032 - -
1.2660 4450 0.0033 - -
1.2689 4460 0.0031 - -
1.2717 4470 0.0033 - -
1.2745 4480 0.0033 - -
1.2774 4490 0.0027 - -
1.2802 4500 0.0035 0.0418 0.8422
1.2831 4510 0.0033 - -
1.2859 4520 0.0035 - -
1.2888 4530 0.0031 - -
1.2916 4540 0.0031 - -
1.2945 4550 0.003 - -
1.2973 4560 0.0035 - -
1.3002 4570 0.0034 - -
1.3030 4580 0.003 - -
1.3058 4590 0.0036 - -
1.3087 4600 0.0032 - -
1.3115 4610 0.0033 - -
1.3144 4620 0.0031 - -
1.3172 4630 0.0032 - -
1.3201 4640 0.0032 - -
1.3229 4650 0.0031 - -
1.3258 4660 0.0035 - -
1.3286 4670 0.003 - -
1.3314 4680 0.0033 - -
1.3343 4690 0.0032 - -
1.3371 4700 0.0033 - -
1.3400 4710 0.003 - -
1.3428 4720 0.0032 - -
1.3457 4730 0.0035 - -
1.3485 4740 0.0034 - -
1.3514 4750 0.003 0.0396 0.8409
1.3542 4760 0.0032 - -
1.3571 4770 0.0033 - -
1.3599 4780 0.0032 - -
1.3627 4790 0.003 - -
1.3656 4800 0.0028 - -
1.3684 4810 0.0031 - -
1.3713 4820 0.0033 - -
1.3741 4830 0.003 - -
1.3770 4840 0.0032 - -
1.3798 4850 0.003 - -
1.3827 4860 0.0034 - -
1.3855 4870 0.0028 - -
1.3883 4880 0.0029 - -
1.3912 4890 0.003 - -
1.3940 4900 0.0032 - -
1.3969 4910 0.003 - -
1.3997 4920 0.0032 - -
1.4026 4930 0.0033 - -
1.4054 4940 0.0031 - -
1.4083 4950 0.0029 - -
1.4111 4960 0.0032 - -
1.4140 4970 0.0035 - -
1.4168 4980 0.0032 - -
1.4196 4990 0.0034 - -
1.4225 5000 0.0032 0.0440 0.8409
1.4253 5010 0.0034 - -
1.4282 5020 0.0029 - -
1.4310 5030 0.0034 - -
1.4339 5040 0.0031 - -
1.4367 5050 0.0033 - -
1.4396 5060 0.003 - -
1.4424 5070 0.003 - -
1.4453 5080 0.0028 - -
1.4481 5090 0.003 - -
1.4509 5100 0.003 - -
1.4538 5110 0.0031 - -
1.4566 5120 0.003 - -
1.4595 5130 0.003 - -
1.4623 5140 0.0032 - -
1.4652 5150 0.0029 - -
1.4680 5160 0.0029 - -
1.4709 5170 0.0031 - -
1.4737 5180 0.0032 - -
1.4765 5190 0.0031 - -
1.4794 5200 0.0027 - -
1.4822 5210 0.0029 - -
1.4851 5220 0.003 - -
1.4879 5230 0.0027 - -
1.4908 5240 0.0031 - -
1.4936 5250 0.0032 0.0432 0.8411
1.4965 5260 0.0028 - -
1.4993 5270 0.0029 - -
1.5022 5280 0.0029 - -
1.5050 5290 0.0027 - -
1.5078 5300 0.0028 - -
1.5107 5310 0.0028 - -
1.5135 5320 0.003 - -
1.5164 5330 0.003 - -
1.5192 5340 0.0029 - -
1.5221 5350 0.0027 - -
1.5249 5360 0.003 - -
1.5278 5370 0.0026 - -
1.5306 5380 0.0028 - -
1.5334 5390 0.0032 - -
1.5363 5400 0.0027 - -
1.5391 5410 0.0033 - -
1.5420 5420 0.003 - -
1.5448 5430 0.0028 - -
1.5477 5440 0.0029 - -
1.5505 5450 0.0028 - -
1.5534 5460 0.003 - -
1.5562 5470 0.0024 - -
1.5591 5480 0.003 - -
1.5619 5490 0.0028 - -
1.5647 5500 0.003 0.0398 0.8398
1.5676 5510 0.0026 - -
1.5704 5520 0.0031 - -
1.5733 5530 0.0028 - -
1.5761 5540 0.003 - -
1.5790 5550 0.0027 - -
1.5818 5560 0.0027 - -
1.5847 5570 0.0027 - -
1.5875 5580 0.0028 - -
1.5903 5590 0.0026 - -
1.5932 5600 0.0026 - -
1.5960 5610 0.0029 - -
1.5989 5620 0.0028 - -
1.6017 5630 0.0028 - -
1.6046 5640 0.0029 - -
1.6074 5650 0.0032 - -
1.6103 5660 0.0026 - -
1.6131 5670 0.0029 - -
1.6160 5680 0.0027 - -
1.6188 5690 0.0029 - -
1.6216 5700 0.0028 - -
1.6245 5710 0.0029 - -
1.6273 5720 0.003 - -
1.6302 5730 0.0026 - -
1.6330 5740 0.0028 - -
1.6359 5750 0.0024 0.0422 0.8383
1.6387 5760 0.0026 - -
1.6416 5770 0.003 - -
1.6444 5780 0.0028 - -
1.6472 5790 0.0024 - -
1.6501 5800 0.0028 - -
1.6529 5810 0.0026 - -
1.6558 5820 0.0026 - -
1.6586 5830 0.0026 - -
1.6615 5840 0.0027 - -
1.6643 5850 0.0028 - -
1.6672 5860 0.0029 - -
1.6700 5870 0.0026 - -
1.6729 5880 0.0027 - -
1.6757 5890 0.0029 - -
1.6785 5900 0.0027 - -
1.6814 5910 0.0027 - -
1.6842 5920 0.0026 - -
1.6871 5930 0.0029 - -
1.6899 5940 0.0028 - -
1.6928 5950 0.0033 - -
1.6956 5960 0.0025 - -
1.6985 5970 0.0026 - -
1.7013 5980 0.0026 - -
1.7042 5990 0.0025 - -
1.7070 6000 0.0027 0.0413 0.8409
1.7098 6010 0.0028 - -
1.7127 6020 0.0026 - -
1.7155 6030 0.0027 - -
1.7184 6040 0.0031 - -
1.7212 6050 0.0027 - -
1.7241 6060 0.0027 - -
1.7269 6070 0.0026 - -
1.7298 6080 0.0027 - -
1.7326 6090 0.0026 - -
1.7354 6100 0.0027 - -
1.7383 6110 0.0027 - -
1.7411 6120 0.0026 - -
1.7440 6130 0.0024 - -
1.7468 6140 0.0026 - -
1.7497 6150 0.0027 - -
1.7525 6160 0.0026 - -
1.7554 6170 0.0026 - -
1.7582 6180 0.0026 - -
1.7611 6190 0.0024 - -
1.7639 6200 0.0029 - -
1.7667 6210 0.0024 - -
1.7696 6220 0.0026 - -
1.7724 6230 0.0027 - -
1.7753 6240 0.0028 - -
1.7781 6250 0.0028 0.0400 0.8384
1.7810 6260 0.0026 - -
1.7838 6270 0.0026 - -
1.7867 6280 0.0027 - -
1.7895 6290 0.0026 - -
1.7923 6300 0.0026 - -
1.7952 6310 0.0025 - -
1.7980 6320 0.0026 - -
1.8009 6330 0.0023 - -
1.8037 6340 0.0027 - -
1.8066 6350 0.0027 - -
1.8094 6360 0.0027 - -
1.8123 6370 0.0027 - -
1.8151 6380 0.0026 - -
1.8180 6390 0.0025 - -
1.8208 6400 0.0026 - -
1.8236 6410 0.0022 - -
1.8265 6420 0.0028 - -
1.8293 6430 0.0026 - -
1.8322 6440 0.0026 - -
1.8350 6450 0.0025 - -
1.8379 6460 0.0025 - -
1.8407 6470 0.0025 - -
1.8436 6480 0.0027 - -
1.8464 6490 0.0028 - -
1.8492 6500 0.0022 0.0406 0.8396
1.8521 6510 0.0024 - -
1.8549 6520 0.0026 - -
1.8578 6530 0.0027 - -
1.8606 6540 0.0026 - -
1.8635 6550 0.0026 - -
1.8663 6560 0.0026 - -
1.8692 6570 0.0026 - -
1.8720 6580 0.0026 - -
1.8749 6590 0.0026 - -
1.8777 6600 0.0025 - -
1.8805 6610 0.0024 - -
1.8834 6620 0.0025 - -
1.8862 6630 0.0025 - -
1.8891 6640 0.0024 - -
1.8919 6650 0.0024 - -
1.8948 6660 0.0023 - -
1.8976 6670 0.0024 - -
1.9005 6680 0.0024 - -
1.9033 6690 0.0024 - -
1.9061 6700 0.0023 - -
1.9090 6710 0.0027 - -
1.9118 6720 0.0024 - -
1.9147 6730 0.0025 - -
1.9175 6740 0.0025 - -
1.9204 6750 0.0025 0.0385 0.8421
1.9232 6760 0.0026 - -
1.9261 6770 0.0024 - -
1.9289 6780 0.0024 - -
1.9318 6790 0.0025 - -
1.9346 6800 0.0025 - -
1.9374 6810 0.0024 - -
1.9403 6820 0.0023 - -
1.9431 6830 0.0023 - -
1.9460 6840 0.0025 - -
1.9488 6850 0.0023 - -
1.9517 6860 0.0022 - -
1.9545 6870 0.0025 - -
1.9574 6880 0.0024 - -
1.9602 6890 0.0025 - -
1.9630 6900 0.0027 - -
1.9659 6910 0.0024 - -
1.9687 6920 0.0025 - -
1.9716 6930 0.0023 - -
1.9744 6940 0.0022 - -
1.9773 6950 0.0022 - -
1.9801 6960 0.0025 - -
1.9830 6970 0.0022 - -
1.9858 6980 0.0024 - -
1.9887 6990 0.0024 - -
1.9915 7000 0.0023 0.0390 0.8406
1.9943 7010 0.0023 - -
1.9972 7020 0.0023 - -

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.3.1
  • Transformers: 4.48.0.dev0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.0
  • Datasets: 3.1.0
  • Tokenizers: 0.21.0
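
To approximate this environment with released packages (a sketch; Transformers 4.48.0.dev0 was a development build, so the nearest stable release is substituted):

pip install sentence-transformers==3.3.1 transformers==4.48.0 torch==2.5.1 accelerate==1.1.0 datasets==3.1.0 tokenizers==0.21.0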

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}