SentenceTransformer
This is a sentence-transformers model. It maps sentences and paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 512 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation (https://www.sbert.net)
- Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
- Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
SentenceTransformer(
  (0): SentenceTransformer(
    (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    (2): Normalize()
  )
  (1): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
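As a quick sanity check, the limits listed above can be read directly off the loaded model. A minimal sketch, using the model id from the Usage section below:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("llmvetter/embedding_finetune")

# Maximum sequence length and output dimensionality, as stated above
print(model.max_seq_length)                      # 384
print(model.get_sentence_embedding_dimension())  # 512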
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("llmvetter/embedding_finetune")
# Run inference
sentences = [
'lg 49uk6300plb/49uk6300plb',
'LG 49UK6300PLB',
'Samsung Galaxy J6',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
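Since the model appears to target matching raw listing titles to canonical product names (see the training samples below), a common pattern is to rank candidate names against a query title. A minimal sketch, with illustrative strings taken from the inference example above; the candidate list is a hypothetical placeholder:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("llmvetter/embedding_finetune")

# A raw listing title (query) and some candidate catalog entries
query = "lg 49uk6300plb/49uk6300plb"
candidates = ["LG 49UK6300PLB", "Samsung Galaxy J6", "Doro 8040"]

query_emb = model.encode([query])     # shape (1, 512)
cand_embs = model.encode(candidates)  # shape (3, 512)

# Cosine scores via the model's configured similarity function;
# pick the best-scoring candidate
scores = model.similarity(query_emb, cand_embs)[0]
best = scores.argmax().item()
print(candidates[best], scores[best].item())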
Evaluation
Metrics
Information Retrieval
- Dataset: Product-Category-Retrieval-Test
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.8086 |
cosine_accuracy@3 | 0.9477 |
cosine_accuracy@5 | 0.9644 |
cosine_accuracy@10 | 0.977 |
cosine_precision@1 | 0.8086 |
cosine_precision@3 | 0.3159 |
cosine_precision@5 | 0.1929 |
cosine_precision@10 | 0.0977 |
cosine_recall@1 | 0.8086 |
cosine_recall@3 | 0.9477 |
cosine_recall@5 | 0.9644 |
cosine_recall@10 | 0.977 |
cosine_ndcg@10 | 0.9042 |
cosine_mrr@10 | 0.8796 |
cosine_map@100 | 0.8805 |
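These numbers come from the library's InformationRetrievalEvaluator, which embeds a set of queries and a corpus and scores the retrieved rankings. A minimal sketch of how such an evaluation is typically wired up; the queries, corpus, and relevance mapping below are hypothetical placeholders, not the actual test set:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("llmvetter/embedding_finetune")

# Hypothetical data: query ids -> listing titles, corpus ids -> product
# names, and query id -> set of relevant corpus ids
queries = {"q1": "lg 49uk6300plb/49uk6300plb"}
corpus = {"d1": "LG 49UK6300PLB", "d2": "Samsung Galaxy J6"}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="Product-Category-Retrieval-Test",
)

# Returns a dict of metrics: accuracy@k, precision@k, recall@k, NDCG@10,
# MRR@10, MAP@100 (the metrics reported in the table above)
results = evaluator(model)
print(results)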
Training Details
Training Dataset
Unnamed Dataset
- Size: 3,820 training samples
- Columns: sentence_0 through sentence_21 (22 columns)
- Approximate statistics based on the first 1000 samples (all 22 columns are of type string):

Column | Min tokens | Mean tokens | Max tokens |
---|---|---|---|
sentence_0 | 4 | 18.41 | 47 |
sentence_1 | 6 | 10.94 | 30 |
sentence_2 | 6 | 11.11 | 30 |
sentence_3 | 6 | 11.15 | 30 |
sentence_4 | 6 | 10.89 | 30 |
sentence_5 | 6 | 10.89 | 30 |
sentence_6 | 6 | 10.98 | 30 |
sentence_7 | 6 | 11.07 | 30 |
sentence_8 | 6 | 11.04 | 30 |
sentence_9 | 6 | 10.84 | 30 |
sentence_10 | 6 | 10.82 | 30 |
sentence_11 | 6 | 10.81 | 30 |
sentence_12 | 6 | 11.05 | 30 |
sentence_13 | 6 | 10.92 | 30 |
sentence_14 | 6 | 11.18 | 30 |
sentence_15 | 6 | 11.07 | 30 |
sentence_16 | 6 | 10.93 | 30 |
sentence_17 | 6 | 11.02 | 30 |
sentence_18 | 6 | 11.04 | 30 |
sentence_19 | 6 | 11.02 | 30 |
sentence_20 | 6 | 10.95 | 30 |
sentence_21 | 6 | 10.86 | 30 |
- Samples (the first three rows, shown one sample per column):

Column | Sample 1 | Sample 2 | Sample 3 |
---|---|---|---|
sentence_0 | sony kd49xf8505bu 49 4k ultra hd tv | doro 8040 4g sim free mobile phone black | fridgemaster muz4965 undercounter freezer white a rated |
sentence_1 | Sony Bravia KD-49XF8505 | Doro 8040 | Fridgemaster MUZ4965 White |
sentence_2 | Intel Core i7-8700K 3.7GHz Box | Bosch HMT75M551 Stainless Steel | Samsung UE49NU7100 |
sentence_3 | Bosch WAN24100GB | Bosch SMI50C15GB Silver | Nikon CoolPix A10 |
sentence_4 | AMD FX-6300 3.5GHz Box | Samsung WW90K5413UX | Samsung UE55NU7100 |
sentence_5 | Bosch WIW28500GB | Panasonic Lumix DMC-TZ70 | Samsung QE55Q7FN |
sentence_6 | Bosch KGN36VL35G Stainless Steel | Sony KD-49XF7073 | Bosch KGN49XL30G Stainless Steel |
sentence_7 | Indesit XWDE751480XS | Nikon CoolPix W100 | Samsung UE49NU7500 |
sentence_8 | CAT S41 Dual SIM | Samsung WD90J6A10AW | LG 55UK6300PLB |
sentence_9 | Sony Xperia XA1 Ultra 32GB | Bosch CFA634GS1B Stainless Steel | Hoover DXOC 68C3B |
sentence_10 | Samsung Galaxy J6 | HP AMD Opteron 8425 HE 2.1GHz Socket F 4800MHz bus Upgrade Tray | Panasonic Lumix DMC-FZ2000 |
sentence_11 | Samsung QE55Q7FN | Canon EOS 800D + 18-55mm IS STM | Panasonic Lumix DMC-TZ80 |
sentence_12 | Bosch KGN39VW35G White | Samsung UE50NU7400 | Bosch WKD28541GB |
sentence_13 | Intel Core i5 7400 3.0GHz Box | Apple iPhone 6S 128GB | Apple iPhone 6 32GB |
sentence_14 | Neff C17UR02N0B Stainless Steel | Samsung RS52N3313SA/EU Graphite | Sony Bravia KDL-32WE613 |
sentence_15 | Samsung RR39M7340SA Silver | Bosch WAW325H0GB | Lec TF50152W White |
sentence_16 | Samsung RB41J7255SR Stainless Steel | Sony Bravia KD-55AF8 | Bosch KGV36VW32G White |
sentence_17 | Hoover DXOC 68C3B | Sony Alpha 6500 | Bosch WAYH8790GB |
sentence_18 | Canon PowerShot SX730 HS | Doro 5030 | Samsung RS68N8240B1/EU Black |
sentence_19 | Samsung RR39M7340BC Black | LG GSL761WBXV Black | Sony Xperia XZ1 |
sentence_20 | Praktica Luxmedia WP240 | Bosch SMS67MW00G White | HP Intel Xeon DP E5506 2.13GHz Socket 1366 800MHz bus Upgrade Tray |
sentence_21 | HP Intel Xeon DP E5506 2.13GHz Socket 1366 800MHz bus Upgrade Tray | AEG L6FBG942R | Sharp R372WM White |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" } (see the sketch below)
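Given the column layout above, each row presumably supplies sentence_0 as the anchor, sentence_1 as the positive, and sentence_2 through sentence_21 as hard negatives; MultipleNegativesRankingLoss additionally treats the other examples in the batch as in-batch negatives. A minimal sketch of instantiating the loss with the listed parameters:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("llmvetter/embedding_finetune")

# scale=20.0 and cosine similarity, matching the parameters listed above
loss = losses.MultipleNegativesRankingLoss(
    model,
    scale=20.0,
    similarity_fct=util.cos_sim,
)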
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- num_train_epochs: 8
- multi_dataset_batch_sampler: round_robin
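A minimal sketch of expressing these non-default values with SentenceTransformerTrainingArguments; the output directory is hypothetical, and everything else keeps the library defaults listed under "All Hyperparameters" below:

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import MultiDatasetBatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/embedding_finetune",  # hypothetical path
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=8,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)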
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 8
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss | Product-Category-Retrieval-Test_cosine_ndcg@10 |
---|---|---|---|
1.0 | 120 | - | 0.7406 |
2.0 | 240 | - | 0.8437 |
3.0 | 360 | - | 0.8756 |
4.0 | 480 | - | 0.8875 |
4.1667 | 500 | 2.5302 | - |
5.0 | 600 | - | 0.8963 |
6.0 | 720 | - | 0.9015 |
7.0 | 840 | - | 0.9042 |
Framework Versions
- Python: 3.11.10
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}