
BERT base trained on 500k Arabic NLI triplets

This is a sentence-transformers model finetuned from aubmindlab/bert-base-arabertv02. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: aubmindlab/bert-base-arabertv02
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: ar
  • License: apache-2.0

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
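
The Pooling module above applies attention-mask-aware mean pooling over token embeddings. Below is a minimal sketch of the equivalent computation with plain 🤗 Transformers; it is illustrative only, not the library's internal code path.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("akhooli/sbert_ar_nli_500k")
model = AutoModel.from_pretrained("akhooli/sbert_ar_nli_500k")

texts = ["مرحبا بالعالم", "جملة أخرى"]  # "Hello world", "Another sentence"
batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, masking out padding tokens.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embeddings.shape)  # torch.Size([2, 768])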

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    # EN: "Does Medicare (A) or (B) cover the cost of hearing aids?"
    'هل يغطي العلاج الطبي (أ) أو (ب) تكلفة المعينات السمعية',
    # EN: "Medicare Part B (medical insurance) covers diagnostic hearing and balance exams if your doctor or another health care provider orders these tests to see whether you need medical treatment. Medicare does not cover routine hearing exams, hearing aids, or hearing-aid fitting exams."
    'يغطي الجزء ب من برنامج Medicare (التأمين الطبي) فحوصات السمع والتوازن التشخيصية إذا طلب طبيبك أو مقدم رعاية صحية آخر هذه الاختبارات لمعرفة ما إذا كنت بحاجة إلى علاج طبي. لا يغطي برنامج Medicare فحوصات السمع الروتينية أو المعينات السمعية أو اختبارات تركيب المعينات السمعية.',
    # EN: "An invisible, or hidden, disability is defined as one that is not immediately apparent. It may not be obvious that some people with visual or hearing impairments do not wear glasses, hearing aids, or discreet hearing devices. Some people with vision loss may wear contact lenses."
    'يتم تعريف الإعاقة غير المرئية ، أو الإعاقة الخفية ، على أنها إعاقات لا تظهر على الفور. قد لا يكون من الواضح أن بعض الأشخاص الذين يعانون من إعاقات بصرية أو سمعية لا يرتدون نظارات أو أجهزة سمعية أو أجهزة سمعية سرية. قد يرتدي بعض الأشخاص الذين يعانون من فقدان البصر العدسات اللاصقة.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
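
For semantic search, the same embeddings can rank a corpus against a query by cosine similarity. Here is a small sketch using the library's util.semantic_search helper; the query and corpus strings are illustrative only.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("akhooli/sbert_ar_nli_500k")

# Tiny illustrative corpus; in practice, encode your own documents.
corpus = [
    "يغطي الجزء ب من برنامج Medicare فحوصات السمع التشخيصية.",  # "Medicare Part B covers diagnostic hearing exams."
    "الإعاقة الخفية هي إعاقة لا تظهر على الفور.",  # "A hidden disability is one that is not immediately apparent."
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# EN: "Does health insurance cover hearing aids?"
query_embedding = model.encode("هل يغطي التأمين الصحي المعينات السمعية؟", convert_to_tensor=True)

# Retrieve the top-2 corpus entries by cosine similarity.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))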

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
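
These non-default values map directly onto SentenceTransformerTrainingArguments. The sketch below reconstructs the training setup under stated assumptions: the dataset is a stand-in (the actual 500k Arabic NLI triplet set is not named in this card), the output directory is a placeholder, and the loss, Matryoshka2dLoss wrapping MultipleNegativesRankingLoss, is inferred from the citation section at the end.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import Matryoshka2dLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("aubmindlab/bert-base-arabertv02")

# Stand-in triplet dataset; replace with the real (anchor, positive, negative) triplets.
train_dataset = Dataset.from_dict({
    "anchor": ["سؤال تجريبي"],
    "positive": ["إجابة ذات صلة"],
    "negative": ["نص غير ذي صلة"],
})

# Loss inferred from the citations below; the truncation dims are an assumption.
loss = Matryoshka2dLoss(model, MultipleNegativesRankingLoss(model), matryoshka_dims=[768, 512, 256, 128, 64])

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # no duplicate texts within an in-batch-negatives batch
)

trainer = SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss)
trainer.train()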

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step    Training Loss   Validation Loss
0.0032 100 4.5441 -
0.0064 200 3.7811 -
0.0096 300 3.0045 -
0.0128 400 2.3688 -
0.016 500 2.0872 -
0.0192 600 1.7032 -
0.0224 700 1.3272 -
0.0256 800 1.4802 -
0.0288 900 1.3168 -
0.032 1000 1.2066 -
0.0352 1100 1.0177 -
0.0384 1200 1.1351 -
0.0416 1300 1.113 -
0.0448 1400 1.0942 -
0.048 1500 0.9924 -
0.0512 1600 1.0132 -
0.0544 1700 0.8718 -
0.0576 1800 0.9367 -
0.0608 1900 0.9507 -
0.064 2000 0.8332 -
0.0672 2100 0.8204 -
0.0704 2200 0.8115 -
0.0736 2300 0.7847 -
0.0768 2400 0.8075 -
0.08 2500 0.7763 -
0.0832 2600 0.795 -
0.0864 2700 0.7992 -
0.0896 2800 0.6968 -
0.0928 2900 0.7747 -
0.096 3000 0.7388 -
0.0992 3100 0.7452 -
0.1024 3200 0.7636 -
0.1056 3300 0.7317 -
0.1088 3400 0.6955 -
0.112 3500 0.618 -
0.1152 3600 0.6321 -
0.1184 3700 0.72 -
0.1216 3800 0.6134 -
0.1248 3900 0.6527 -
0.128 4000 0.6359 -
0.1312 4100 0.6293 -
0.1344 4200 0.7077 -
0.1376 4300 0.6344 -
0.1408 4400 0.7153 -
0.144 4500 0.5617 -
0.1472 4600 0.5975 -
0.1504 4700 0.6195 -
0.1536 4800 0.6643 -
0.1568 4900 0.5301 -
0.16 5000 0.6004 0.5724
0.1632 5100 0.5675 -
0.1664 5200 0.6142 -
0.1696 5300 0.6126 -
0.1728 5400 0.5825 -
0.176 5500 0.5813 -
0.1792 5600 0.5297 -
0.1824 5700 0.5582 -
0.1856 5800 0.4837 -
0.1888 5900 0.6209 -
0.192 6000 0.5778 -
0.1952 6100 0.5522 -
0.1984 6200 0.5854 -
0.2016 6300 0.6199 -
0.2048 6400 0.5157 -
0.208 6500 0.5153 -
0.2112 6600 0.5249 -
0.2144 6700 0.5053 -
0.2176 6800 0.5894 -
0.2208 6900 0.5541 -
0.224 7000 0.4542 -
0.2272 7100 0.5183 -
0.2304 7200 0.6235 -
0.2336 7300 0.5005 -
0.2368 7400 0.5946 -
0.24 7500 0.5288 -
0.2432 7600 0.5249 -
0.2464 7700 0.5884 -
0.2496 7800 0.5656 -
0.2528 7900 0.4746 -
0.256 8000 0.5057 -
0.2592 8100 0.4832 -
0.2624 8200 0.508 -
0.2656 8300 0.5462 -
0.2688 8400 0.4673 -
0.272 8500 0.5126 -
0.2752 8600 0.5257 -
0.2784 8700 0.4994 -
0.2816 8800 0.5081 -
0.2848 8900 0.5148 -
0.288 9000 0.4887 -
0.2912 9100 0.4843 -
0.2944 9200 0.4671 -
0.2976 9300 0.5234 -
0.3008 9400 0.5028 -
0.304 9500 0.527 -
0.3072 9600 0.4727 -
0.3104 9700 0.472 -
0.3136 9800 0.5004 -
0.3168 9900 0.4835 -
0.32 10000 0.4233 0.4415
0.3232 10100 0.4619 -
0.3264 10200 0.4404 -
0.3296 10300 0.4706 -
0.3328 10400 0.481 -
0.336 10500 0.4546 -
0.3392 10600 0.4369 -
0.3424 10700 0.4431 -
0.3456 10800 0.5086 -
0.3488 10900 0.4436 -
0.352 11000 0.4651 -
0.3552 11100 0.4281 -
0.3584 11200 0.487 -
0.3616 11300 0.5097 -
0.3648 11400 0.4658 -
0.368 11500 0.3955 -
0.3712 11600 0.4575 -
0.3744 11700 0.4383 -
0.3776 11800 0.456 -
0.3808 11900 0.4728 -
0.384 12000 0.4027 -
0.3872 12100 0.51 -
0.3904 12200 0.4521 -
0.3936 12300 0.433 -
0.3968 12400 0.4233 -
0.4 12500 0.5328 -
0.4032 12600 0.4671 -
0.4064 12700 0.4673 -
0.4096 12800 0.4387 -
0.4128 12900 0.4661 -
0.416 13000 0.4499 -
0.4192 13100 0.4379 -
0.4224 13200 0.438 -
0.4256 13300 0.4037 -
0.4288 13400 0.4679 -
0.432 13500 0.4373 -
0.4352 13600 0.3899 -
0.4384 13700 0.4288 -
0.4416 13800 0.4388 -
0.4448 13900 0.4482 -
0.448 14000 0.3733 -
0.4512 14100 0.4127 -
0.4544 14200 0.3715 -
0.4576 14300 0.4738 -
0.4608 14400 0.4168 -
0.464 14500 0.4323 -
0.4672 14600 0.4472 -
0.4704 14700 0.4264 -
0.4736 14800 0.4593 -
0.4768 14900 0.4702 -
0.48 15000 0.5111 0.3809
0.4832 15100 0.4558 -
0.4864 15200 0.4334 -
0.4896 15300 0.4352 -
0.4928 15400 0.412 -
0.496 15500 0.4105 -
0.4992 15600 0.4489 -
0.5024 15700 0.4335 -
0.5056 15800 0.4561 -
0.5088 15900 0.4023 -
0.512 16000 0.4175 -
0.5152 16100 0.4041 -
0.5184 16200 0.3707 -
0.5216 16300 0.4348 -
0.5248 16400 0.5013 -
0.528 16500 0.4745 -
0.5312 16600 0.3618 -
0.5344 16700 0.3334 -
0.5376 16800 0.4493 -
0.5408 16900 0.3965 -
0.544 17000 0.3775 -
0.5472 17100 0.4476 -
0.5504 17200 0.3626 -
0.5536 17300 0.3892 -
0.5568 17400 0.4296 -
0.56 17500 0.4048 -
0.5632 17600 0.3933 -
0.5664 17700 0.3831 -
0.5696 17800 0.413 -
0.5728 17900 0.4691 -
0.576 18000 0.3932 -
0.5792 18100 0.3794 -
0.5824 18200 0.4369 -
0.5856 18300 0.3538 -
0.5888 18400 0.3838 -
0.592 18500 0.4549 -
0.5952 18600 0.3524 -
0.5984 18700 0.3645 -
0.6016 18800 0.3574 -
0.6048 18900 0.4043 -
0.608 19000 0.4237 -
0.6112 19100 0.3954 -
0.6144 19200 0.4416 -
0.6176 19300 0.3497 -
0.6208 19400 0.3876 -
0.624 19500 0.4796 -
0.6272 19600 0.3652 -
0.6304 19700 0.3674 -
0.6336 19800 0.3957 -
0.6368 19900 0.3798 -
0.64 20000 0.3862 0.3410
0.6432 20100 0.3603 -
0.6464 20200 0.3934 -
0.6496 20300 0.4268 -
0.6528 20400 0.4032 -
0.656 20500 0.432 -
0.6592 20600 0.4231 -
0.6624 20700 0.34 -
0.6656 20800 0.3865 -
0.6688 20900 0.3877 -
0.672 21000 0.3416 -
0.6752 21100 0.3774 -
0.6784 21200 0.3859 -
0.6816 21300 0.4284 -
0.6848 21400 0.4059 -
0.688 21500 0.3968 -
0.6912 21600 0.3213 -
0.6944 21700 0.3995 -
0.6976 21800 0.3936 -
0.7008 21900 0.4261 -
0.704 22000 0.3689 -
0.7072 22100 0.403 -
0.7104 22200 0.3405 -
0.7136 22300 0.3736 -
0.7168 22400 0.3704 -
0.72 22500 0.4128 -
0.7232 22600 0.3856 -
0.7264 22700 0.3509 -
0.7296 22800 0.3937 -
0.7328 22900 0.3195 -
0.736 23000 0.3048 -
0.7392 23100 0.3909 -
0.7424 23200 0.3446 -
0.7456 23300 0.3051 -
0.7488 23400 0.4251 -
0.752 23500 0.3653 -
0.7552 23600 0.3629 -
0.7584 23700 0.3462 -
0.7616 23800 0.3623 -
0.7648 23900 0.3816 -
0.768 24000 0.3861 -
0.7712 24100 0.4037 -
0.7744 24200 0.4009 -
0.7776 24300 0.3985 -
0.7808 24400 0.3682 -
0.784 24500 0.3544 -
0.7872 24600 0.3623 -
0.7904 24700 0.4221 -
0.7936 24800 0.4016 -
0.7968 24900 0.3713 -
0.8 25000 0.3749 0.3171
0.8032 25100 0.3561 -
0.8064 25200 0.3136 -
0.8096 25300 0.422 -
0.8128 25400 0.3248 -
0.816 25500 0.3054 -
0.8192 25600 0.3646 -
0.8224 25700 0.3846 -
0.8256 25800 0.3679 -
0.8288 25900 0.3224 -
0.832 26000 0.3422 -
0.8352 26100 0.3401 -
0.8384 26200 0.3546 -
0.8416 26300 0.3626 -
0.8448 26400 0.3567 -
0.848 26500 0.3375 -
0.8512 26600 0.361 -
0.8544 26700 0.3525 -
0.8576 26800 0.3264 -
0.8608 26900 0.3663 -
0.864 27000 0.3662 -
0.8672 27100 0.3852 -
0.8704 27200 0.3932 -
0.8736 27300 0.3092 -
0.8768 27400 0.3259 -
0.88 27500 0.3676 -
0.8832 27600 0.3636 -
0.8864 27700 0.34 -
0.8896 27800 0.417 -
0.8928 27900 0.3417 -
0.896 28000 0.2964 -
0.8992 28100 0.3654 -
0.9024 28200 0.3434 -
0.9056 28300 0.308 -
0.9088 28400 0.3453 -
0.912 28500 0.3325 -
0.9152 28600 0.3709 -
0.9184 28700 0.3526 -
0.9216 28800 0.3644 -
0.9248 28900 0.315 -
0.928 29000 0.3538 -
0.9312 29100 0.3551 -
0.9344 29200 0.3523 -
0.9376 29300 0.3401 -
0.9408 29400 0.3935 -
0.944 29500 0.3787 -
0.9472 29600 0.3352 -
0.9504 29700 0.3143 -
0.9536 29800 0.3983 -
0.9568 29900 0.3086 -
0.96 30000 0.3317 0.3043
0.9632 30100 0.3117 -
0.9664 30200 0.3562 -
0.9696 30300 0.372 -
0.9728 30400 0.3217 -
0.976 30500 0.3232 -
0.9792 30600 0.3881 -
0.9824 30700 0.321 -
0.9856 30800 0.3582 -
0.9888 30900 0.3284 -
0.992 31000 0.3274 -
0.9952 31100 0.3201 -
0.9984 31200 0.373 -

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.1
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

Matryoshka2dLoss

@misc{li20242d,
    title={2D Matryoshka Sentence Embeddings},
    author={Xianming Li and Zongxi Li and Jing Li and Haoran Xie and Qing Li},
    year={2024},
    eprint={2402.14776},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
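
Since this card cites MatryoshkaLoss and Matryoshka2dLoss, the embeddings were presumably trained to remain useful when truncated to fewer dimensions. A hedged sketch of loading the model with reduced output dimensionality via the library's truncate_dim option (the choice of 256 is arbitrary):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("akhooli/sbert_ar_nli_500k", truncate_dim=256)
embeddings = model.encode(["جملة للتجربة"])  # "a test sentence"
print(embeddings.shape)  # (1, 256)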