slinger-base
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset:
- json
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
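The Pooling and Normalize modules in this architecture amount to keeping the first ([CLS]) token vector and scaling it to unit length. The following is a minimal sketch in plain Python, not the library's implementation. The `token_embeddings` values are hypothetical and use 4 dimensions instead of 768 to stay readable.

```python
import math


def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector (pooling_mode_cls_token=True)."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit L2 norm, which is what a Normalize() step does."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Hypothetical per-token output for a 3-token input (stand-in for BertModel output).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity of two embeddings equals their dot product.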
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("carsondial/slinger20241231-3")
# Run inference
sentences = [
'google side wiki chrome firefox \n\nor \n\ngoogle sidewiki update\n\nor \n\ngoogle sidewiki launch\n\nor \n\ngoogle sidewiki comments\n\nor \n\ngoogle sidewiki browser extension\n\nor \n\nwhat is sidewiki\n\nNote: These queries are based on the content of the document and are intended to reflect the search behavior of a user who has read the document.',
'Yesterday, Google announced “SideWiki” a new feature of the Firefox and IE browsers (Chrome to come soon) that allows anyone to contribute comments about any webpage –including this one. The impacts are far reaching, now every web page on the internet is social and can have consumer opinion –both positive and negative.\nControl Over the Corporate Website Is Shifting To The Customers:\n- Customers trust each other more than you –now they can assert their voices “on” your webpage. Every webpage on your corporate website, intranet, and extranet are now social. Anyone who accesses these features can now rely on their friends or those who contribute to get additional information. Competitors can link to their competing product, consumers can rate or discuss the positive and negative experiences with your company or product.\n- Yet, don’t expect everyone to participate –or contribute valuable content. While social technology adoption is on the rise, not everyone writes, rates, and contributes content in every location, likely those who have experienced the product, influential, or competitors will be involved. Secondly, content created in this sidebar may be generally useless. To be successful, Google will need it to look more like Wikipedia than YouTube comments\n- Expect Google to integrate this feature with existing systems. Google recently launched profiles, a feature that is the foundation for extending their social reach. With large social networks like Gmail already in place (That’s right, email is a social network) they can eventually sort content on SideWiki by context of friends, experts, or other sources. Google’s strategy is to ‘envelope’ the web this is typical of their approach.\n- Although early, expect other social networks to launch competing features. '
'Facebook has already created an ‘inlay’ so you can view links shared in the Facebook newspage in the context of your friends –expect them to grow this feature out shortly.\nRecommendations for the Web Strategist: Develop a Social Strategy Now\n- Shift your thinking: recognize that you don’t own your corporate website –your customers do. Accept the mindshift that your job is to not only serve up product and corporate content but to also be a platform and enabler for customers to discuss, share, and make suggestions to how you should improve what you offer.\n- Develop a social strategy with dedicated resources. With every webpage now potentially social, you’ll need to develop a process, roles, and policy to ensure you’re monitoring the conversation, participating as you would in blog discussions, and influencing the discussion. 80% of success is developing an internal strategy, providing education before a free-for-all happens with customers and employees.\n- Don’t be reactive to negative content –embrace social content now. Give users the ability to leave social feedback directly on your corporate webpages, or aggregate existing social content. CMS vendors are developing features to enable this, as well as community platform vendors like Kickapps, Pluck, Liveworld’s Livebar offer rapid deployment options.\nI predicted Google would be one of the first to do this, however I expected them to start with Chrome, not FF and IE. Expect this to be a default feature of Chrome –not just a plugin in future efforts.\nUpdate: Just saw an interesting tweet from @prem_k about impacts to CRM. He’s Right. CRM systems (Salesforce, SAP, Oracle, Rightnow and others) will need to aggregate content in Google’s Sidewiki. '
'It’s not just CRM, Brand Monitoring companies (Radian6, Buzzmetrics, Cymfony, Visible Technologies) will also need to “suck in” that data.\nUpdate 2, a few hours later: We should stop to think about how competitors could display ads “on” your corporate site and you couldn’t stop it, why? Take a look at Google’s business model, they envelop and categorize the web, then display ads on it. There’s nothing stopping them from allowing advertisers to put ads on SideWiki as “sponsored” information. For example, Coke could run their latest ads on the Pepsi.com SikeWiki area. HP could run ads on the Dell.com site. This *already* happens in the search engine result pages on Google.com why not in sidewiki?\nUpdate 3, the next day: I just tried out SideWiki to see how it works. I came to this very post and found out that there are already three comments. I left a comment welcoming folks, and it gave me the option to Tweet it, which I did. Here’s what sidewiki looks like, you don’t never have to have the plugin for this to work. Which means that this certainly has lower barriers to adoption. A few other field notes? I no longer have to fuss with captacha on blogs or name/email/url once I’m logged in to SideWiki, I can comment around the web. Secondly, it centralizes all my comments on my Google profile tool. You do see what Google is doing right? They are turning the whole web into a social network.',
'A business unit is a division or department within an organization that is responsible for a specific task or product. The unit may be responsible for the manufacture of a particular product, the marketing of that product or the accounting of that product. Some businesses have multiple units, and this structure can increase efficiency and responsiveness to the needs of the customer.\nThere are many types of business units, and these units all have their own unique role. For instance, a business unit may be a single person with a singular mission, or a multi-level corporation that is staffed with hundreds of employees. Each type of business entity is regulated differently, and has its own regulations. However, all of them have one thing in common: they are functional and important.\nOne of the main functions of a business unit is to gather information about the target market. To do this, the unit must collect feedback from the marketplace and determine the right approach to take. This process can be accomplished through surveys, focus groups, and even market research. If a business unit is able to identify the best strategy to pursue, it will be able to boost profits.\nBusiness units are also referred to as divisions or departments, and can be either independent or linked to the parent company. Businesses with a diverse customer base will often set up separate business units for each individual market. It’s a good idea to set a specific mission for each of these units to allow for easier management. In addition, having multiple units can be beneficial for project management.\nOne of the most basic duties of a business unit is to maintain a competitive edge. This can be achieved by offering a better quality or price for a given output. For example, a business unit that manufactures boots may produce a more comfortable pair of boots. '
'But if a business unit is not efficient in delivering its services, its costs will rise.\nOther functions performed by a business unit include sales and marketing. When a unit is successful, it improves the organization’s overall performance. Having a clear mission statement is one of the most important things a business unit can do. That mission should be specific, relevant, and measurable.\nIn order to be a success, a unit needs to have a well thought out strategy and a dedicated team of employees. Moreover, the unit must have a clear mission statement that sets the tone for the organization.\nA well-defined mission statement can also be a great way to motivate and encourage employees to perform at their best. This can be done by having a specific mission statement, or by making sure that the mission is aspirational but achievable.\nAnother way to measure the performance of a business unit is through a business unit analysis. This is a review of all of the processes and activities that are performed by the unit. This can be done by the unit manager or by an organizational manager. The objective of this process is to ensure that the organization is not wasting its resources or losing out on opportunities.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
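Since the model was trained with MatryoshkaLoss at 768, 512, 256, 128 and 64 dimensions, its embeddings can be shortened to one of those sizes by keeping the leading components and renormalizing. Sentence Transformers exposes this through the `truncate_dim` argument of `SentenceTransformer`. The sketch below shows the underlying arithmetic in plain Python. The 8-dimensional vectors and the helper names are hypothetical stand-ins, not real model output.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 8-dimensional embeddings standing in for 768-dimensional ones.
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
document = [0.5, 0.5, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0]

full_score = cosine_similarity(query, document)
short_score = cosine_similarity(
    truncate_and_normalize(query, 2),
    truncate_and_normalize(document, 2),
)
print(full_score, short_score)
```

Scores generally change after truncation, which is why the evaluation below reports each dimension separately.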
Evaluation
Metrics
Information Retrieval
- Datasets: dim_768, dim_512, dim_256, dim_128 and dim_64
- Evaluated with InformationRetrievalEvaluator
Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|
cosine_accuracy@1 | 0.576 | 0.5646 | 0.5498 | 0.527 | 0.4734 |
cosine_accuracy@3 | 0.6732 | 0.6702 | 0.6578 | 0.6332 | 0.5816 |
cosine_accuracy@5 | 0.719 | 0.7144 | 0.7004 | 0.676 | 0.6242 |
cosine_accuracy@10 | 0.7634 | 0.7594 | 0.7538 | 0.7338 | 0.684 |
cosine_precision@1 | 0.576 | 0.5646 | 0.5498 | 0.527 | 0.4734 |
cosine_precision@3 | 0.2244 | 0.2234 | 0.2193 | 0.2111 | 0.1939 |
cosine_precision@5 | 0.1438 | 0.1429 | 0.1401 | 0.1352 | 0.1248 |
cosine_precision@10 | 0.0763 | 0.0759 | 0.0754 | 0.0734 | 0.0684 |
cosine_recall@1 | 0.576 | 0.5646 | 0.5498 | 0.527 | 0.4734 |
cosine_recall@3 | 0.6732 | 0.6702 | 0.6578 | 0.6332 | 0.5816 |
cosine_recall@5 | 0.719 | 0.7144 | 0.7004 | 0.676 | 0.6242 |
cosine_recall@10 | 0.7634 | 0.7594 | 0.7538 | 0.7338 | 0.684 |
cosine_ndcg@10 | 0.6664 | 0.6597 | 0.6487 | 0.626 | 0.5735 |
cosine_mrr@10 | 0.6357 | 0.6281 | 0.6155 | 0.592 | 0.5388 |
cosine_map@100 | 0.641 | 0.6334 | 0.6206 | 0.5975 | 0.5451 |
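In the table above, accuracy@k equals recall@k and precision@k equals accuracy@k divided by k. That pattern is consistent with each query having exactly one relevant document, although the card does not state this. Under that assumption, the per-query metrics reduce to the sketch below. It is a plain-Python illustration with hypothetical document ids, not the InformationRetrievalEvaluator implementation.

```python
import math


def ir_metrics_at_k(ranked_ids, relevant_id, k):
    """Metrics at cutoff k for one query that has a single relevant document."""
    top_k = ranked_ids[:k]
    hit = relevant_id in top_k
    rank = top_k.index(relevant_id) + 1 if hit else None
    return {
        "accuracy": 1.0 if hit else 0.0,
        "precision": 1.0 / k if hit else 0.0,
        "recall": 1.0 if hit else 0.0,
        "mrr": 1.0 / rank if hit else 0.0,
        # With one relevant document the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": 1.0 / math.log2(rank + 1) if hit else 0.0,
    }


def average_metrics(results, k):
    """Average the per-query metrics over all (ranking, relevant_id) pairs."""
    per_query = [ir_metrics_at_k(ranked, relevant, k) for ranked, relevant in results]
    return {name: sum(m[name] for m in per_query) / len(per_query) for name in per_query[0]}


# Hypothetical rankings: (ranked corpus ids, id of the single relevant document).
results = [
    (["d1", "d7", "d3"], "d1"),  # relevant document at rank 1
    (["d4", "d2", "d9"], "d2"),  # relevant document at rank 2
    (["d5", "d6", "d8"], "d0"),  # relevant document not retrieved
]

scores = average_metrics(results, k=3)
print(scores)
```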
Training Details
Training Dataset
json
- Dataset: json
- Size: 45,000 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive |
---|---|---|
type | string | string |
min | 4 tokens | 43 tokens |
mean | 11.87 tokens | 389.85 tokens |
max | 208 tokens | 512 tokens |
- Samples:
anchor: how to add password screen in wordpress
positive:
The following wordpress webpage ([url removed, login to view]) has two options depending on which graphic you click on (both are separate pages in wordpress). I want the user to see a password screen when they click on either option, and for there to be a different password for either screen. Once they enter the password they would progress to that screen.
Hello, nice to meet you. I'm professional in wordpress/html/php/css/js. I have done similar to this project before. I can start soon. Looking forward to connecting and working with you soon. Regards.
16 freelanceria on tarjonnut keskimäärin 107 £ tähän työhön
i will work on this project, i have more than three years of web development experience. Development portfolio given below. [url removed, login to view] [url removed, login to view] (Streaming Website with Admin Panel) [url removed, login to view] ( Lisää
Hello, dear? How are you? I am a software developer in Desktop(C/C++, C#, JAVA, VBA, [url removed, login to view], [url remov...
anchor: landing page monkey review
positive:
I believe that LandingPage Monkey is a great tool to have in your affiliate toolkit. This page building can create a various types of pages including webinar registration pages, sales pages, exit pages, contest registration pages, and any other type of marketing pages that you can think off. Give LandingPage Monkey a try and if you aren't completely satisfied then they do provide a 30 day money back guarantee.
- Value For The Money9
- Beginner Friendly8.5
- Quality Of The Product8.5
- FREE Page Hosting For Users9.5
Landing Page Monkey is our best selling landing page/lead capture page building platform that anyone can use to create amazing looking and attention grabbing pages fast!
Small businesses with little or no coding and graphic design skills are always struggling while trying to increase their conversion rates and get more sales. They often hire freelance programmers and designers that charge a lot of money for their job.
But it is way worst when they choose to hire cheap servic...
anchor: wix website builder software
positive:
WebStarts is everything you need to create and maintain your very own website. Traditionally websites are written in HTML code, that code is stored on a server, and a domain is pointed to it. The process of setting up a traditional website is tedious, technical, and expensive. If you don't know how to code you might hire a web developer. Next, you need to purchase server space. Finally, you need to register a domain. It's a hassle to manage three different bills and three different companies. The whole process is so confusing it leaves a lot of people wondering how to make a website at all.
These are questions that have fairly non-specific answers. Depending on your type of site, there are different options for improving SEO, for example if you use a CMS then you may find benefit fromt he myriad of SEO plugins available for the given platform. As for the amount of time it takes to see the benefit of changes you may make, that ha a number of variables. As an example, other sites utilizi...
- Loss: MatryoshkaLoss with these parameters:
{
  "loss": "MultipleNegativesRankingLoss",
  "matryoshka_dims": [768, 512, 256, 128, 64],
  "matryoshka_weights": [1, 1, 1, 1, 1],
  "n_dims_per_step": -1
}
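The loss parameters above describe a weighted sum of MultipleNegativesRankingLoss values, one for each truncated embedding size. The sketch below shows that idea in plain Python, assuming in-batch negatives scored by cosine similarity. The `scale=20.0` value is an assumption because the card does not report it. The toy 4-dimensional vectors and function names are hypothetical, and this is not the library's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: positive i is the target for anchor i, and every
    other positive in the batch acts as a negative (scale is an assumption)."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with label i
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss computed on truncated embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        total += weight * mnr_loss(truncated_anchors, truncated_positives)
    return total


# Hypothetical batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
positives = [[0.9, 0.1, 0.0, 0.2], [0.1, 0.9, 0.3, 0.0]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

With equal weights, as in this card, every embedding size contributes equally to the gradient, so the leading components are pushed to be useful on their own.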
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
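The batch size, accumulation and scheduler settings combine into an effective batch size and a learning-rate curve. The sketch below works through that arithmetic. It assumes a single device and 348 optimizer steps in total, which is the last step in the training logs. `lr_at` is a hypothetical helper that follows the usual linear-warmup and cosine-decay shape, so treat it as an approximation of the scheduler rather than its exact code.

```python
import math

# Values taken from the non-default hyperparameters.
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
learning_rate = 2e-05
warmup_ratio = 0.1

# Assumptions: one device, and 348 optimizer steps in total.
num_devices = 1
total_steps = 348

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step):
    """Linear warmup to the peak rate, then a half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size)  # 512
print(warmup_steps)          # 35
print(lr_at(warmup_steps))   # peak learning rate, 2e-05
```

Gradient accumulation enlarges the batch used for each optimizer update, but in-batch negatives are still drawn from each forward pass of 32 pairs.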
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|---|
0.1137 | 10 | 3.191 | - | - | - | - | - |
0.2274 | 20 | 2.6214 | - | - | - | - | - |
0.3412 | 30 | 1.9557 | - | - | - | - | - |
0.4549 | 40 | 1.4834 | - | - | - | - | - |
0.5686 | 50 | 1.42 | - | - | - | - | - |
0.6823 | 60 | 1.3626 | - | - | - | - | - |
0.7960 | 70 | 1.1723 | - | - | - | - | - |
0.9097 | 80 | 1.2129 | - | - | - | - | - |
0.9893 | 87 | - | 0.6616 | 0.6570 | 0.6454 | 0.6177 | 0.5570 |
1.0341 | 90 | 1.257 | - | - | - | - | - |
1.1478 | 100 | 1.1609 | - | - | - | - | - |
1.2615 | 110 | 1.0792 | - | - | - | - | - |
1.3753 | 120 | 0.9907 | - | - | - | - | - |
1.4890 | 130 | 0.8536 | - | - | - | - | - |
1.6027 | 140 | 0.8934 | - | - | - | - | - |
1.7164 | 150 | 0.9073 | - | - | - | - | - |
1.8301 | 160 | 0.8485 | - | - | - | - | - |
1.9439 | 170 | 0.878 | - | - | - | - | - |
1.9893 | 174 | - | 0.6647 | 0.6600 | 0.6472 | 0.6238 | 0.5684 |
2.0682 | 180 | 0.922 | - | - | - | - | - |
2.1819 | 190 | 0.8154 | - | - | - | - | - |
2.2957 | 200 | 0.8993 | - | - | - | - | - |
2.4094 | 210 | 0.7296 | - | - | - | - | - |
2.5231 | 220 | 0.6828 | - | - | - | - | - |
2.6368 | 230 | 0.7187 | - | - | - | - | - |
2.7505 | 240 | 0.72 | - | - | - | - | - |
2.8643 | 250 | 0.6948 | - | - | - | - | - |
2.9780 | 260 | 0.7066 | - | - | - | - | - |
2.9893 | 261 | - | 0.666 | 0.661 | 0.6493 | 0.6261 | 0.5721 |
3.1023 | 270 | 0.7934 | - | - | - | - | - |
3.2161 | 280 | 0.701 | - | - | - | - | - |
3.3298 | 290 | 0.7146 | - | - | - | - | - |
3.4435 | 300 | 0.5952 | - | - | - | - | - |
3.5572 | 310 | 0.6048 | - | - | - | - | - |
3.6709 | 320 | 0.7172 | - | - | - | - | - |
3.7846 | 330 | 0.6414 | - | - | - | - | - |
3.8984 | 340 | 0.6422 | - | - | - | - | - |
3.9893 | 348 | - | 0.6664 | 0.6597 | 0.6487 | 0.6260 | 0.5735 |
- The bold row denotes the saved checkpoint.
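The fractional values in the Epoch column can be reproduced from the dataset size, batch size and gradient accumulation steps. The sketch below is inferred from the logged numbers and is not documented Trainer behavior. It assumes a single device and that the last partial batch is kept. `logged_epoch` is a hypothetical helper.

```python
import math

train_samples = 45_000  # training dataset size from the card
per_device_train_batch_size = 32
gradient_accumulation_steps = 16

# Assumptions: one device, and the final partial batch is not dropped.
batches_per_epoch = math.ceil(train_samples / per_device_train_batch_size)
steps_per_epoch = batches_per_epoch // gradient_accumulation_steps


def logged_epoch(step):
    """Fractional epoch for a global optimizer step (hypothetical helper)."""
    completed_epochs, step_in_epoch = divmod(step, steps_per_epoch)
    if step_in_epoch == 0 and step > 0:
        # The last step of an epoch is reported at the end of that epoch.
        completed_epochs -= 1
        step_in_epoch = steps_per_epoch
    fraction = step_in_epoch * gradient_accumulation_steps / batches_per_epoch
    return round(completed_epochs + fraction, 4)


print(batches_per_epoch, steps_per_epoch)  # 1407 87
for step in (10, 87, 90, 348):
    print(step, logged_epoch(step))
```

Under these assumptions the evaluation rows at steps 87, 174, 261 and 348 fall at the end of each of the four epochs.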
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu121
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}