metadata
base_model: pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1
datasets: []
language: []
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:5749
  - loss:CosineSimilarityLoss
widget:
  - source_sentence: >-
      আমি "... comoving মহাজাগতিক বিশ্ৰাম ফ্ৰেমৰ তুলনাত ... সিংহ নক্ষত্ৰমণ্ডলৰ
      ফালে কিছু 371 কিলোমিটাৰ প্ৰতি ছেকেণ্ডত" আগবাঢ়িছো.
    sentences:
      - বাস্কেটবল খেলুৱৈগৰাকীয়ে নিজৰ দলৰ হৈ পইণ্ট লাভ কৰিবলৈ ওলাইছে।
      - আন কোনো বস্তুৰ লগত আপেক্ষিক নহোৱা কোনো ‘ষ্টিল’ নাই।
      - এজনী ছোৱালীয়ে বতাহ বাদ্যযন্ত্ৰ বজায়।
  - source_sentence: চাৰিটা ল’ৰা-ছোৱালীয়ে ভঁৰালৰ জীৱ-জন্তুবোৰলৈ চাই আছে।
    sentences:
      - ডাইনিং টেবুল এখনৰ চাৰিওফালে বৃদ্ধৰ দল এটাই পোজ দিছে।
      - বিকিনি পিন্ধা চাৰিগৰাকী মহিলাই বিলত ভলীবল খেলি আছে।
      - ল’ৰা-ছোৱালীয়ে ভেড়া চাই।
  - source_sentence: ডালত বহি থকা দুটা টান ঈগল।
    sentences:
      - জাতৰ জেব্ৰা ডানিঅ’ অত্যন্ত কঠোৰ মাছ, ইহঁতক হত্যা কৰাটো প্ৰায় কঠিন।
      - এটা ডালত দুটা ঈগল বহি আছে।
      - >-
        নূন্যতম মজুৰিৰ আইনসমূহে কম দক্ষ, কম উৎপাদনশীল লোকক আটাইতকৈ বেছি আঘাত
        দিয়ে।
  - source_sentence: >-
      "মই আচলতে যি বিচাৰিছো সেয়া হৈছে মুছলমান জনসংখ্যাৰ এটা অনুমান..." @ThanosK
      আৰু @T.E.D., এটা সামগ্ৰিক, সাধাৰণ জনসংখ্যাৰ অনুমান f.e.
    sentences:
      - এগৰাকী মহিলাই সেউজীয়া পিঁয়াজ কাটি আছে।
      - >-
        তলত দিয়া কথাখিনি মোৰ কুকুৰ কাণৰ দৰে কপিৰ পৰা লোৱা হৈছে নিউ পেংগুইন
        এটলাছ অৱ মেডিভেল হিষ্ট্ৰীৰ।
      - আমাৰ দৰে সৌৰজগতৰ কোনো তাৰকাৰাজ্যৰ বাহিৰত থকাটো সম্ভৱ হ’ব পাৰে।
  - source_sentence: ইণ্টাৰনেট কেমেৰাৰ জৰিয়তে এগৰাকী ছোৱালীৰ লগত কথা পাতিলে মানুহজনে।
    sentences:
      - গছৰ শাৰী এটাৰ সন্মুখত পথাৰত ভেড়া চৰিছে।
      - এজন মানুহে গীটাৰ বজাই আছে।
      - ৱেবকেমৰ জৰিয়তে এগৰাকী ছোৱালীৰ সৈতে কথা পাতিছে এজন কিশোৰে।
model-index:
  - name: >-
      SentenceTransformer based on
      pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: pritamdeka/stsb assamese translated dev
          type: pritamdeka/stsb-assamese-translated-dev
        metrics:
          - type: pearson_cosine
            value: 0.8525258323169252
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8506593647943235
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.8334889460288037
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.843042040822402
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.8351723933495433
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8450734552112781
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.8273071926204811
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8277520425148079
            name: Spearman Dot
          - type: pearson_max
            value: 0.8525258323169252
            name: Pearson Max
          - type: spearman_max
            value: 0.8506593647943235
            name: Spearman Max
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: pritamdeka/stsb assamese translated test
          type: pritamdeka/stsb-assamese-translated-test
        metrics:
          - type: pearson_cosine
            value: 0.8138083526567048
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8119367763029309
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.8044112753419641
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8073243490029997
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.805728285628756
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8086070843216111
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.7754575809083841
            name: Pearson Dot
          - type: spearman_dot
            value: 0.7720173359758135
            name: Spearman Dot
          - type: pearson_max
            value: 0.8138083526567048
            name: Pearson Max
          - type: spearman_max
            value: 0.8119367763029309
            name: Spearman Max

SentenceTransformer based on pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1

This is a sentence-transformers model finetuned from pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/models?library=sentence-transformers

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
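
The stack above is a MuRIL (BERT) encoder followed by mean pooling over token embeddings (pooling_mode_mean_tokens: True), with inputs truncated at 512 tokens. As an illustrative sketch not taken from the original card, the same embeddings can be reproduced with transformers directly; the Hub ID is assumed from the Usage section below.

import torch
from transformers import AutoModel, AutoTokenizer

# Hub ID assumed from the Usage section below.
model_id = "pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1-sts"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["এজন মানুহে গীটাৰ বজাই আছে।"]  # "A man is playing the guitar."
encoded = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embeddings.shape)  # torch.Size([1, 768])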

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1-sts")
# Run inference
sentences = [
    'ইণ্টাৰনেট কেমেৰাৰ জৰিয়তে এগৰাকী ছোৱালীৰ লগত কথা পাতিলে মানুহজনে।',  # "The man talked to a girl through an internet camera."
    'ৱেবকেমৰ জৰিয়তে এগৰাকী ছোৱালীৰ সৈতে কথা পাতিছে এজন কিশোৰে।',  # "A teenager is talking to a girl over a webcam."
    'এজন মানুহে গীটাৰ বজাই আছে।',  # "A man is playing the guitar."
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
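
Since the card lists semantic search among the intended uses, here is a small sketch (not part of the original card) that ranks a toy corpus against a query with the same model. The corpus sentences are reused from the widget examples above, and model.similarity applies the model's default cosine similarity.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1-sts")

# Toy corpus for illustration, reusing sentences from the widget examples.
corpus = [
    "এজন মানুহে গীটাৰ বজাই আছে।",
    "ৱেবকেমৰ জৰিয়তে এগৰাকী ছোৱালীৰ সৈতে কথা পাতিছে এজন কিশোৰে।",
    "গছৰ শাৰী এটাৰ সন্মুখত পথাৰত ভেড়া চৰিছে।",
]
query = "ইণ্টাৰনেট কেমেৰাৰ জৰিয়তে এগৰাকী ছোৱালীৰ লগত কথা পাতিলে মানুহজনে।"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Cosine similarity between the query and every corpus sentence.
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 3]
best = scores.argmax().item()
print(corpus[best], scores[0, best].item())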

Evaluation

Metrics

Semantic Similarity

Dataset: pritamdeka/stsb-assamese-translated-dev

Metric              Value
pearson_cosine      0.8525
spearman_cosine     0.8507
pearson_manhattan   0.8335
spearman_manhattan  0.8430
pearson_euclidean   0.8352
spearman_euclidean  0.8451
pearson_dot         0.8273
spearman_dot        0.8278
pearson_max         0.8525
spearman_max        0.8507

Semantic Similarity

Dataset: pritamdeka/stsb-assamese-translated-test

Metric              Value
pearson_cosine      0.8138
spearman_cosine     0.8119
pearson_manhattan   0.8044
spearman_manhattan  0.8073
pearson_euclidean   0.8057
spearman_euclidean  0.8086
pearson_dot         0.7755
spearman_dot        0.7720
pearson_max         0.8138
spearman_max        0.8119
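
Correlations of this kind are what Sentence Transformers' EmbeddingSimilarityEvaluator reports: Pearson and Spearman correlations for cosine, Manhattan, Euclidean, and dot-product similarities, plus their maxima. Below is a minimal sketch of running such an evaluation, with made-up toy pairs and gold scores standing in for the translated STS-B dev/test splits used for the numbers above.

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1-sts")

# Hypothetical pairs with gold similarity labels in [0, 1]; the reported
# results use pritamdeka/stsb-assamese-translated-dev and -test instead.
sentences1 = [
    "ডালত বহি থকা দুটা টান ঈগল।",
    "এজন মানুহে গীটাৰ বজাই আছে।",
    "চাৰিটা ল’ৰা-ছোৱালীয়ে ভঁৰালৰ জীৱ-জন্তুবোৰলৈ চাই আছে।",
]
sentences2 = [
    "এটা ডালত দুটা ঈগল বহি আছে।",
    "এগৰাকী মহিলাই সেউজীয়া পিঁয়াজ কাটি আছে।",
    "ল’ৰা-ছোৱালীয়ে ভেড়া চাই।",
]
gold_scores = [0.95, 0.05, 0.6]  # made-up labels for illustration

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores, name="sts-toy")
print(evaluator(model))  # dict such as {'sts-toy_pearson_cosine': ..., 'sts-toy_spearman_cosine': ...}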

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
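
Putting the pieces together (the base model, the CosineSimilarityLoss tag, the 5,749-pair training set, and the hyperparameters above), the run can be approximated with the Sentence Transformers v3 trainer. This is a hedged sketch with toy stand-in data, not the author's original script; the actual run trained on the translated STS-B train split and evaluated on the dev split.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("pritamdeka/muril-base-cased-assamese-indicxnli-random-negatives-v1")
loss = CosineSimilarityLoss(model)

# Toy stand-in for the 5,749-pair train split and the dev split; columns
# follow the (sentence1, sentence2, score) format CosineSimilarityLoss expects.
toy = {
    "sentence1": ["ডালত বহি থকা দুটা টান ঈগল।", "এজন মানুহে গীটাৰ বজাই আছে।"],
    "sentence2": ["এটা ডালত দুটা ঈগল বহি আছে।", "এগৰাকী মহিলাই সেউজীয়া পিঁয়াজ কাটি আছে।"],
    "score": [0.95, 0.05],
}
train_dataset = Dataset.from_dict(toy)
eval_dataset = Dataset.from_dict(toy)

args = SentenceTransformerTrainingArguments(
    output_dir="muril-assamese-sts",  # hypothetical output path
    num_train_epochs=10,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    warmup_ratio=0.1,
    fp16=True,  # as in the card; requires a GPU
    eval_strategy="steps",
    eval_steps=100,  # matches the 100-step cadence in the training logs below
    save_steps=100,
    load_best_model_at_end=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()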

Training Logs

Dev/Test Spearman Cosine are spearman_cosine on pritamdeka/stsb-assamese-translated-dev and -test; Validation Loss is the evaluation loss on the dev split.

Epoch    Step  Training Loss  Validation Loss  Dev Spearman Cosine  Test Spearman Cosine
1.1111   100   0.0331         0.0259           0.8482               -
2.2222   200   0.0176         0.0253           0.8515               -
3.3333   300   0.0110         0.0253           0.8513               -
4.4444   400   0.0066         0.0259           0.8492               -
5.5556   500   0.0048         0.0255           0.8511               -
6.6667   600   0.0037         0.0256           0.8508               -
7.7778   700   0.0033         0.0254           0.8515               -
8.8889   800   0.0029         0.0255           0.8512               -
10.0     900   0.0027         0.0257           0.8507               0.8119
  • The saved checkpoint corresponds to the best-performing row (load_best_model_at_end: True).

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.42.4
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}