---
base_model: Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka
datasets:
- Omartificial-Intelligence-Space/Arabic-stsb
language:
- ar
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:947818
- loss:SoftmaxLoss
- loss:CosineSimilarityLoss
widget:
- source_sentence: امرأة تكتب شيئاً
sentences:
- مراهق يتحدث إلى فتاة عبر كاميرا الإنترنت
- امرأة تقطع البصل الأخضر.
- مجموعة من كبار السن يتظاهرون حول طاولة الطعام.
- source_sentence: تتشكل النجوم في مناطق تكوين النجوم، والتي تنشأ نفسها من السحب الجزيئية.
sentences:
- لاعب كرة السلة على وشك تسجيل نقاط لفريقه.
- المقال التالي مأخوذ من نسختي من "أطلس البطريق الجديد للتاريخ الوسطى"
- قد يكون من الممكن أن يوجد نظام شمسي مثل نظامنا خارج المجرة
- source_sentence: >-
تحت السماء الزرقاء مع الغيوم البيضاء، يصل طفل لمس مروحة طائرة واقفة على
حقل من العشب.
sentences:
- امرأة تحمل كأساً
- طفل يحاول لمس مروحة طائرة
- اثنان من عازبين عن الشرب يستعدون للعشاء
- source_sentence: رجل في منتصف العمر يحلق لحيته في غرفة ذات جدران بيضاء والتي لا تبدو كحمام
sentences:
- فتى يخطط اسمه على مكتبه
- رجل ينام
- المرأة وحدها وهي نائمة في غرفة نومها
- source_sentence: الكلب البني مستلقي على جانبه على سجادة بيج، مع جسم أخضر في المقدمة.
sentences:
- شخص طويل القامة
- المرأة تنظر من النافذة.
- لقد مات الكلب
model-index:
- name: >-
SentenceTransformer based on
Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts dev
type: sts-dev
metrics:
- type: pearson_cosine
value: 0.8383581637565862
name: Pearson Cosine
- type: spearman_cosine
value: 0.8389373148442993
name: Spearman Cosine
- type: pearson_manhattan
value: 0.8247947413553784
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.8329104956151686
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.8249963167509389
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.8336591462431132
name: Spearman Euclidean
- type: pearson_dot
value: 0.8071855574990106
name: Pearson Dot
- type: spearman_dot
value: 0.8097706351791779
name: Spearman Dot
- type: pearson_max
value: 0.8383581637565862
name: Pearson Max
- type: spearman_max
value: 0.8389373148442993
name: Spearman Max
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.7907507025363603
name: Pearson Cosine
- type: spearman_cosine
value: 0.7893080660475024
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7923222026451455
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7946838339078852
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7903690631114766
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.793426368251902
name: Spearman Euclidean
- type: pearson_dot
value: 0.7404285389360442
name: Pearson Dot
- type: spearman_dot
value: 0.7353599094850335
name: Spearman Dot
- type: pearson_max
value: 0.7923222026451455
name: Pearson Max
- type: spearman_max
value: 0.7946838339078852
name: Spearman Max
---

# SentenceTransformer based on Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka
This is a sentence-transformers model finetuned from Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka on the all-nli and sts datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- Model Type: Sentence Transformer
- Base model: Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Datasets:
- all-nli
- sts
- Language: ar
### Model Sources

- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
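The `Pooling` module above mean-pools token embeddings over the attention mask. As a point of reference, here is a minimal sketch that reproduces this embedding pipeline with plain `transformers`; it assumes the checkpoint loads as a standard `BertModel`, matching the architecture printed above:

```python
import torch
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average token embeddings, ignoring padding positions
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

model_id = "Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka-multi-task"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["طفل يحاول لمس مروحة طائرة", "امرأة تحمل كأساً"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    output = model(**encoded)
embeddings = mean_pooling(output, encoded["attention_mask"])
print(embeddings.shape)  # torch.Size([2, 768])
```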
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka-multi-task")
# Run inference
sentences = [
    'الكلب البني مستلقي على جانبه على سجادة بيج، مع جسم أخضر في المقدمة.',
    'لقد مات الكلب',
    'شخص طويل القامة',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
## Evaluation

### Metrics

#### Semantic Similarity

- Dataset: `sts-dev`
- Evaluated with `EmbeddingSimilarityEvaluator`
Metric | Value |
---|---|
pearson_cosine | 0.8384 |
spearman_cosine | 0.8389 |
pearson_manhattan | 0.8248 |
spearman_manhattan | 0.8329 |
pearson_euclidean | 0.825 |
spearman_euclidean | 0.8337 |
pearson_dot | 0.8072 |
spearman_dot | 0.8098 |
pearson_max | 0.8384 |
spearman_max | 0.8389 |
#### Semantic Similarity

- Dataset: `sts-test`
- Evaluated with `EmbeddingSimilarityEvaluator`
Metric | Value |
---|---|
pearson_cosine | 0.7908 |
spearman_cosine | 0.7893 |
pearson_manhattan | 0.7923 |
spearman_manhattan | 0.7947 |
pearson_euclidean | 0.7904 |
spearman_euclidean | 0.7934 |
pearson_dot | 0.7404 |
spearman_dot | 0.7354 |
pearson_max | 0.7923 |
spearman_max | 0.7947 |
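Both tables come from sentence-transformers' `EmbeddingSimilarityEvaluator`. A hedged sketch of how such numbers could be recomputed; the `validation` split and the `sentence1`/`sentence2`/`score` column names are assumptions based on the dataset statistics shown later in this card:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka-multi-task")

# Assumed split and column names; adjust to the actual dataset layout
sts = load_dataset("Omartificial-Intelligence-Space/Arabic-stsb", split="validation")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=sts["sentence1"],
    sentences2=sts["sentence2"],
    scores=sts["score"],  # gold scores normalized to [0, 1]
    main_similarity=SimilarityFunction.COSINE,
    name="sts-dev",
)
print(evaluator(model))
```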
## Training Details

### Training Datasets

#### all-nli

- Dataset: all-nli
- Size: 942,069 training samples
- Columns: `premise`, `hypothesis`, and `label`
- Approximate statistics based on the first 1000 samples:

| | premise | hypothesis | label |
|:---|:---|:---|:---|
| type | string | string | int |
| details | min: 5 tokens; mean: 14.09 tokens; max: 43 tokens | min: 4 tokens; mean: 8.28 tokens; max: 28 tokens | 0: ~33.40%; 1: ~33.30%; 2: ~33.30% |

- Samples:

| premise | hypothesis | label |
|:---|:---|:---|
| شخص على حصان يقفز فوق طائرة معطلة | شخص يقوم بتدريب حصانه للمنافسة | 1 |
| شخص على حصان يقفز فوق طائرة معطلة | شخص في مطعم، يطلب عجة. | 2 |
| شخص على حصان يقفز فوق طائرة معطلة | شخص في الهواء الطلق، على حصان. | 0 |

- Loss: `SoftmaxLoss`
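For reference, a minimal sketch of constructing this loss. `SoftmaxLoss` trains a small classifier over the concatenated pair embeddings (u, v, |u - v|), so it needs the embedding dimension and the three NLI labels:

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka")

# Classification head over (u, v, |u - v|) for labels 0, 1, 2
softmax_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)
```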
#### sts

- Dataset: sts at f5a6f89
- Size: 5,749 training samples
- Columns: `sentence1`, `sentence2`, and `score`
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|:---|:---|:---|:---|
| type | string | string | float |
| details | min: 4 tokens; mean: 7.46 tokens; max: 22 tokens | min: 4 tokens; mean: 7.36 tokens; max: 18 tokens | min: 0.0; mean: 0.54; max: 1.0 |

- Samples:

| sentence1 | sentence2 | score |
|:---|:---|:---|
| طائرة ستقلع | طائرة جوية ستقلع | 1.0 |
| رجل يعزف على ناي كبير | رجل يعزف على الناي. | 0.76 |
| رجل ينشر الجبن الممزق على البيتزا | رجل ينشر الجبن الممزق على بيتزا غير مطبوخة | 0.76 |

- Loss: `CosineSimilarityLoss` with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```
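And the matching sketch for this loss: `CosineSimilarityLoss` regresses cosine(u, v) toward the gold 0-1 score, here with the `MSELoss` criterion listed above:

```python
import torch
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka")

# MSE between the pair's cosine similarity and the gold score
cosine_loss = losses.CosineSimilarityLoss(model=model, loss_fct=torch.nn.MSELoss())
```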
### Evaluation Datasets

#### all-nli

- Dataset: all-nli
- Size: 1,000 evaluation samples
- Columns: `premise`, `hypothesis`, and `label`
- Approximate statistics based on the first 1000 samples:

| | premise | hypothesis | label |
|:---|:---|:---|:---|
| type | string | string | int |
| details | min: 5 tokens; mean: 15.1 tokens; max: 48 tokens | min: 4 tokens; mean: 8.11 tokens; max: 21 tokens | 0: ~33.10%; 1: ~33.30%; 2: ~33.60% |

- Samples:

| premise | hypothesis | label |
|:---|:---|:---|
| امرأتان يتعانقان بينما يحملان طرود | الأخوات يعانقون بعضهم لوداعاً بينما يحملون حزمة بعد تناول الغداء | 1 |
| امرأتان يتعانقان بينما يحملان حزمة | إمرأتان يحملان حزمة | 0 |
| امرأتان يتعانقان بينما يحملان حزمة | الرجال يتشاجرون خارج مطعم | 2 |

- Loss: `SoftmaxLoss`
#### sts

- Dataset: sts at f5a6f89
- Size: 1,500 evaluation samples
- Columns: `sentence1`, `sentence2`, and `score`
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|:---|:---|:---|:---|
| type | string | string | float |
| details | min: 4 tokens; mean: 12.55 tokens; max: 42 tokens | min: 4 tokens; mean: 12.49 tokens; max: 54 tokens | min: 0.0; mean: 0.47; max: 1.0 |

- Samples:

| sentence1 | sentence2 | score |
|:---|:---|:---|
| رجل يرتدي قبعة صلبة يرقص | رجل يرتدي قبعة صلبة يرقص. | 1.0 |
| طفل صغير يركب حصاناً. | طفل يركب حصاناً. | 0.95 |
| رجل يطعم فأراً لأفعى | الرجل يطعم الفأر للثعبان. | 1.0 |

- Loss: `CosineSimilarityLoss` with these parameters:

```json
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
```
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `multi_dataset_batch_sampler`: round_robin
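For context, a minimal end-to-end sketch of the multi-task setup these hyperparameters describe, using the sentence-transformers v3 trainer API. The `output_dir` and the all-nli dataset id are placeholders (the card does not name the NLI dataset repository), and evaluation wiring is omitted for brevity:

```python
import torch
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)
from sentence_transformers.training_args import MultiDatasetBatchSamplers

model = SentenceTransformer("Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka")

# The sts dataset id comes from the card metadata; the all-nli id is a placeholder
sts = load_dataset("Omartificial-Intelligence-Space/Arabic-stsb", split="train")  # sentence1, sentence2, score
all_nli = load_dataset("path/to/arabic-all-nli", split="train")  # premise, hypothesis, label

# One loss per dataset, keyed by the same names as train_dataset
train_loss = {
    "all-nli": losses.SoftmaxLoss(
        model=model,
        sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
        num_labels=3,
    ),
    "sts": losses.CosineSimilarityLoss(model=model, loss_fct=torch.nn.MSELoss()),
}

args = SentenceTransformerTrainingArguments(
    output_dir="arabert-multi-task",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    warmup_ratio=0.1,
    fp16=True,
    # Alternate batches between the two datasets, per the card's round_robin setting
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset={"all-nli": all_nli, "sts": sts},
    loss=train_loss,
)
trainer.train()
```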
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>
### Training Logs
Epoch | Step | Training Loss | all-nli loss | sts loss | sts-dev_spearman_cosine | sts-test_spearman_cosine |
---|---|---|---|---|---|---|
0.1389 | 100 | 0.5848 | 1.0957 | 0.0324 | 0.8309 | - |
0.2778 | 200 | 0.5243 | 0.9695 | 0.0294 | 0.8386 | - |
0.4167 | 300 | 0.5135 | 0.9486 | 0.0295 | 0.8398 | - |
0.5556 | 400 | 0.4896 | 0.9366 | 0.0305 | 0.8317 | - |
0.6944 | 500 | 0.5048 | 0.9201 | 0.0298 | 0.8395 | - |
0.8333 | 600 | 0.4862 | 0.8885 | 0.0291 | 0.8370 | - |
0.9722 | 700 | 0.4628 | 0.8893 | 0.0289 | 0.8389 | - |
1.0 | 720 | - | - | - | - | 0.7893 |
### Framework Versions
- Python: 3.9.18
- Sentence Transformers: 3.0.1
- Transformers: 4.42.4
- PyTorch: 2.2.2+cu121
- Accelerate: 0.26.1
- Datasets: 2.19.0
- Tokenizers: 0.19.1
## Citation

### BibTeX

#### Sentence Transformers and SoftmaxLoss
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```