slinger-base

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
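
The module stack above reads as: encode tokens with BERT, keep only the [CLS] token vector (pooling_mode_cls_token is True), then scale that vector to unit length. The following is a minimal pure-Python sketch of the last two stages on toy data; the function name `cls_pool_and_normalize` and the 4-dimensional vectors are illustrative assumptions, not part of the library.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Take the first ([CLS]) token vector and scale it to unit L2 norm."""
    cls_vector = token_embeddings[0]            # stage (1): CLS pooling
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0.0:
        return list(cls_vector)                 # guard against division by zero
    return [x / norm for x in cls_vector]       # stage (2): Normalize()

# Toy 3-token sequence with 4-dimensional vectors (the real model uses 768).
toy_tokens = [
    [3.0, 0.0, 4.0, 0.0],   # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]
sentence_embedding = cls_pool_and_normalize(toy_tokens)
print(sentence_embedding)
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity.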

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("carsondial/slinger20241231-3")
# Run inference
sentences = [
    'google side wiki chrome firefox \n\nor \n\ngoogle sidewiki update\n\nor \n\ngoogle sidewiki launch\n\nor \n\ngoogle sidewiki comments\n\nor \n\ngoogle sidewiki browser extension\n\nor \n\nwhat is sidewiki\n\nNote: These queries are based on the content of the document and are intended to reflect the search behavior of a user who has read the document.',
    'Yesterday, Google announced “SideWiki” a new feature of the Firefox and IE browsers (Chrome to come soon) that allows anyone to contribute comments about any webpage –including this one. The impacts are far reaching, now every web page on the internet is social and can have consumer opinion –both positive and negative.\nControl Over the Corporate Website Is Shifting To The Customers:\n- Customers trust each other more than you –now they can assert their voices “on” your webpage. Every webpage on your corporate website, intranet, and extranet are now social. Anyone who accesses these features can now rely on their friends or those who contribute to get additional information. Competitors can link to their competing product, consumers can rate or discuss the positive and negative experiences with your company or product.\n- Yet, don’t expect everyone to participate –or contribute valuable content. While social technology adoption is on the rise, not everyone writes, rates, and contributes content in every location, likely those who have experienced the product, influential, or competitors will be involved. Secondly, content created in this sidebar may be generally useless. To be successful, Google will need it to look more like Wikipedia than YouTube comments\n- Expect Google to integrate this feature with existing systems. Google recently launched profiles, a feature that is the foundation for extending their social reach. With large social networks like Gmail already in place (That’s right, email is a social network) they can eventually sort content on SideWiki by context of friends, experts, or other sources. Google’s strategy is to ‘envelope’ the web this is typical of their approach.\n- Although early, expect other social networks to launch competing features. 
Facebook has already created an ‘inlay’ so you can view links shared in the Facebook newspage in the context of your friends –expect them to grow this feature out shortly.\nRecommendations for the Web Strategist: Develop a Social Strategy Now\n- Shift your thinking: recognize that you don’t own your corporate website –your customers do. Accept the mindshift that your job is to not only serve up product and corporate content but to also be a platform and enabler for customers to discuss, share, and make suggestions to how you should improve what you offer.\n- Develop a social strategy with dedicated resources. With every webpage now potentially social, you’ll need to develop a process, roles, and policy to ensure you’re monitoring the conversation, participating as you would in blog discussions, and influencing the discussion. 80% of success is developing an internal strategy, providing education before a free-for-all happens with customers and employees.\n- Don’t be reactive to negative content –embrace social content now. Give users the ability to leave social feedback directly on your corporate webpages, or aggregate existing social content. CMS vendors are developing features to enable this, as well as community platform vendors like Kickapps, Pluck, Liveworld’s Livebar offer rapid deployment options.\nI predicted Google would be one of the first to do this, however I expected them to start with Chrome, not FF and IE. Expect this to be a default feature of Chrome –not just a plugin in future efforts.\nUpdate: Just saw an interesting tweet from @prem_k about impacts to CRM. He’s Right. CRM systems (Salesforce, SAP, Oracle, Rightnow and others) will need to aggregate content in Google’s Sidewiki. 
It’s not just CRM, Brand Monitoring companies (Radian6, Buzzmetrics, Cymfony, Visible Technologies) will also need to “suck in” that data.\nUpdate 2, a few hours later: We should stop to think about how competitors could display ads “on” your corporate site and you couldn’t stop it, why? Take a look at Google’s business model, they envelop and categorize the web, then display ads on it. There’s nothing stopping them from allowing advertisers to put ads on SideWiki as “sponsored” information. For example, Coke could run their latest ads on the Pepsi.com SikeWiki area. HP could run ads on the Dell.com site. This *already* happens in the search engine result pages on Google.com why not in sidewiki?\nUpdate 3, the next day: I just tried out SideWiki to see how it works. I came to this very post and found out that there are already three comments. I left a comment welcoming folks, and it gave me the option to Tweet it, which I did. Here’s what sidewiki looks like, you don’t never have to have the plugin for this to work. Which means that this certainly has lower barriers to adoption. A few other field notes? I no longer have to fuss with captacha on blogs or name/email/url once I’m logged in to SideWiki, I can comment around the web. Secondly, it centralizes all my comments on my Google profile tool. You do see what Google is doing right? They are turning the whole web into a social network.',
    'A business unit is a division or department within an organization that is responsible for a specific task or product. The unit may be responsible for the manufacture of a particular product, the marketing of that product or the accounting of that product. Some businesses have multiple units, and this structure can increase efficiency and responsiveness to the needs of the customer.\nThere are many types of business units, and these units all have their own unique role. For instance, a business unit may be a single person with a singular mission, or a multi-level corporation that is staffed with hundreds of employees. Each type of business entity is regulated differently, and has its own regulations. However, all of them have one thing in common: they are functional and important.\nOne of the main functions of a business unit is to gather information about the target market. To do this, the unit must collect feedback from the marketplace and determine the right approach to take. This process can be accomplished through surveys, focus groups, and even market research. If a business unit is able to identify the best strategy to pursue, it will be able to boost profits.\nBusiness units are also referred to as divisions or departments, and can be either independent or linked to the parent company. Businesses with a diverse customer base will often set up separate business units for each individual market. It’s a good idea to set a specific mission for each of these units to allow for easier management. In addition, having multiple units can be beneficial for project management.\nOne of the most basic duties of a business unit is to maintain a competitive edge. This can be achieved by offering a better quality or price for a given output. For example, a business unit that manufactures boots may produce a more comfortable pair of boots. 
But if a business unit is not efficient in delivering its services, its costs will rise.\nOther functions performed by a business unit include sales and marketing. When a unit is successful, it improves the organization’s overall performance. Having a clear mission statement is one of the most important things a business unit can do. That mission should be specific, relevant, and measurable.\nIn order to be a success, a unit needs to have a well thought out strategy and a dedicated team of employees. Moreover, the unit must have a clear mission statement that sets the tone for the organization.\nA well-defined mission statement can also be a great way to motivate and encourage employees to perform at their best. This can be done by having a specific mission statement, or by making sure that the mission is aspirational but achievable.\nAnother way to measure the performance of a business unit is through a business unit analysis. This is a review of all of the processes and activities that are performed by the unit. This can be done by the unit manager or by an organizational manager. The objective of this process is to ensure that the organization is not wasting its resources or losing out on opportunities.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
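
Since the model was trained with MatryoshkaLoss over dimensions 768, 512, 256, 128 and 64, an embedding can be shortened by keeping its leading components. The sketch below shows that idea in pure Python on toy 8-dimensional vectors; `truncate_and_normalize` and `cosine` are hypothetical helpers written for this example. Sentence Transformers also accepts a `truncate_dim` argument when constructing `SentenceTransformer`; whether its output is re-normalized is worth checking in the library documentation.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 8-dimensional stand-ins for 768-dimensional embeddings.
query = [0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.0, 0.0]
doc = [0.4, 0.5, 0.2, 0.3, 0.0, 0.1, 0.1, 0.0]

# Compare the similarity at full size and at two truncated sizes.
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(doc, dim)
    print(dim, round(cosine(q, d), 4))
```

Smaller dimensions reduce storage and search cost at some loss in retrieval quality, as the evaluation metrics for each dimension indicate.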

Evaluation

Metrics

Information Retrieval

Metric               dim_768  dim_512  dim_256  dim_128  dim_64
cosine_accuracy@1    0.576    0.5646   0.5498   0.527    0.4734
cosine_accuracy@3    0.6732   0.6702   0.6578   0.6332   0.5816
cosine_accuracy@5    0.719    0.7144   0.7004   0.676    0.6242
cosine_accuracy@10   0.7634   0.7594   0.7538   0.7338   0.684
cosine_precision@1   0.576    0.5646   0.5498   0.527    0.4734
cosine_precision@3   0.2244   0.2234   0.2193   0.2111   0.1939
cosine_precision@5   0.1438   0.1429   0.1401   0.1352   0.1248
cosine_precision@10  0.0763   0.0759   0.0754   0.0734   0.0684
cosine_recall@1      0.576    0.5646   0.5498   0.527    0.4734
cosine_recall@3      0.6732   0.6702   0.6578   0.6332   0.5816
cosine_recall@5      0.719    0.7144   0.7004   0.676    0.6242
cosine_recall@10     0.7634   0.7594   0.7538   0.7338   0.684
cosine_ndcg@10       0.6664   0.6597   0.6487   0.626    0.5735
cosine_mrr@10        0.6357   0.6281   0.6155   0.592    0.5388
cosine_map@100       0.641    0.6334   0.6206   0.5975   0.5451
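
In these results accuracy@k equals recall@k, and precision@k equals accuracy@k divided by k (for example 0.6732 / 3 ≈ 0.2244). That pattern is what one would expect when every query has exactly one relevant document, which fits anchor/positive pairs. The sketch below computes the metrics under that assumption; `metrics_at_k` and the example ranks are made up for illustration and are not the evaluator's actual code.

```python
import math

def metrics_at_k(ranks, k):
    """Retrieval metrics when each query has exactly one relevant document.

    `ranks` holds the 1-based rank of the relevant document per query,
    or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    # Reciprocal rank, counted only when the hit is inside the top k.
    mrr = sum(1.0 / r for r, h in zip(ranks, hits) if h) / n
    # With one relevant document the ideal DCG is 1, so NDCG = 1 / log2(rank + 1).
    ndcg = sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n
    return {
        "accuracy": accuracy,
        "recall": accuracy,          # one relevant doc: found or not found
        "precision": accuracy / k,   # at most one of the k results is relevant
        "mrr": mrr,
        "ndcg": ndcg,
    }

# Hypothetical ranks of the relevant document for five queries.
example_ranks = [1, 3, None, 2, 12]
results = metrics_at_k(example_ranks, k=10)
for name, value in results.items():
    print(f"{name}@10: {value:.4f}")
```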

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 45,000 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
              anchor          positive
    type      string          string
    min       4 tokens        43 tokens
    mean      11.87 tokens    389.85 tokens
    max       208 tokens      512 tokens
  • Samples:
    anchor: how to add password screen in wordpress
    positive: The following wordpress webpage ([url removed, login to view]) has two options depending on which graphic you click on (both are separate pages in wordpress). I want the user to see a password screen when they click on either option, and for there to be a different password for either screen. Once they enter the password they would progress to that screen.
      Hello, nice to meet you. I'm professional in wordpress/html/php/css/js. I have done similar to this project before. I can start soon. Looking forward to connecting and working with you soon. Regards.
      16 freelanceria on tarjonnut keskimäärin 107 £ tähän työhön
      i will work on this project, i have more than three years of web development experience. Development portfolio given below. [url removed, login to view] [url removed, login to view] (Streaming Website with Admin Panel) [url removed, login to view] ( Lisää
      Hello, dear? How are you? I am a software developer in Desktop(C/C++, C#, JAVA, VBA, [url removed, login to view], [url remov...
    anchor: landing page monkey review
    positive: I believe that LandingPage Monkey is a great tool to have in your affiliate toolkit. This page building can create a various types of pages including webinar registration pages, sales pages, exit pages, contest registration pages, and any other type of marketing pages that you can think off. Give LandingPage Monkey a try and if you aren't completely satisfied then they do provide a 30 day money back guarantee.
      - Value For The Money9
      - Beginner Friendly8.5
      - Quality Of The Product8.5
      - FREE Page Hosting For Users9.5
      Landing Page Monkey is our best selling landing page/lead capture page building platform that anyone can use to create amazing looking and attention grabbing pages fast!
      Small businesses with little or no coding and graphic design skills are always struggling while trying to increase their conversion rates and get more sales. They often hire freelance programmers and designers that charge a lot of money for their job.
      But it is way worst when they choose to hire cheap servic...
    anchor: wix website builder software
    positive: WebStarts is everything you need to create and maintain your very own website. Traditionally websites are written in HTML code, that code is stored on a server, and a domain is pointed to it. The process of setting up a traditional website is tedious, technical, and expensive. If you don't know how to code you might hire a web developer. Next, you need to purchase server space. Finally, you need to register a domain. It's a hassle to manage three different bills and three different companies. The whole process is so confusing it leaves a lot of people wondering how to make a website at all.
      These are questions that have fairly non-specific answers. Depending on your type of site, there are different options for improving SEO, for example if you use a CMS then you may find benefit fromt he myriad of SEO plugins available for the given platform. As for the amount of time it takes to see the benefit of changes you may make, that ha a number of variables. As an example, other sites utilizi...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
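
To make the loss configuration concrete, here is a rough pure-Python sketch of how a Matryoshka wrapper around an in-batch-negatives ranking loss can be computed: for each listed dimension, truncate the embeddings, score every anchor against every positive in the batch with scaled cosine similarity, and apply cross-entropy with the matching positive as the target. The function names, the toy vectors, and the scale of 20 are assumptions made for this example; the library's implementation may differ in detail.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch candidates; positives[i] is the target for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss computed on truncated embeddings."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [v[:dim] for v in anchors]
        truncated_positives = [v[:dim] for v in positives]
        loss += weight * mnr_loss(truncated_anchors, truncated_positives)
    return loss

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
toy_anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
toy_positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
toy_loss = matryoshka_loss(toy_anchors, toy_positives, dims=[4, 2], weights=[1, 1])
print(round(toy_loss, 6))
```

With equal weights of 1 on all five dimensions, the reported training loss would be a sum of five per-dimension losses, which is one reason the logged values start well above what a single ranking loss would give.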
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 16
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
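
Taken together, learning_rate, lr_scheduler_type and warmup_ratio describe a linear warmup followed by cosine decay. The sketch below shows the shape of such a schedule; it assumes 348 total optimizer steps (the last step in the training logs) and a warmup length of ceil(0.1 × 348) = 35 steps. `lr_at_step` is a hypothetical helper, not a library function.

```python
import math

def lr_at_step(step, total_steps=348, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Learning rate at the start, during warmup, at the peak, mid-run, and at the end.
for step in (0, 10, 35, 174, 348):
    print(step, f"{lr_at_step(step):.2e}")
```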

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 16
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10 dim_256_cosine_ndcg@10 dim_128_cosine_ndcg@10 dim_64_cosine_ndcg@10
0.1137 10 3.191 - - - - -
0.2274 20 2.6214 - - - - -
0.3412 30 1.9557 - - - - -
0.4549 40 1.4834 - - - - -
0.5686 50 1.42 - - - - -
0.6823 60 1.3626 - - - - -
0.7960 70 1.1723 - - - - -
0.9097 80 1.2129 - - - - -
0.9893 87 - 0.6616 0.6570 0.6454 0.6177 0.5570
1.0341 90 1.257 - - - - -
1.1478 100 1.1609 - - - - -
1.2615 110 1.0792 - - - - -
1.3753 120 0.9907 - - - - -
1.4890 130 0.8536 - - - - -
1.6027 140 0.8934 - - - - -
1.7164 150 0.9073 - - - - -
1.8301 160 0.8485 - - - - -
1.9439 170 0.878 - - - - -
1.9893 174 - 0.6647 0.6600 0.6472 0.6238 0.5684
2.0682 180 0.922 - - - - -
2.1819 190 0.8154 - - - - -
2.2957 200 0.8993 - - - - -
2.4094 210 0.7296 - - - - -
2.5231 220 0.6828 - - - - -
2.6368 230 0.7187 - - - - -
2.7505 240 0.72 - - - - -
2.8643 250 0.6948 - - - - -
2.9780 260 0.7066 - - - - -
2.9893 261 - 0.666 0.661 0.6493 0.6261 0.5721
3.1023 270 0.7934 - - - - -
3.2161 280 0.701 - - - - -
3.3298 290 0.7146 - - - - -
3.4435 300 0.5952 - - - - -
3.5572 310 0.6048 - - - - -
3.6709 320 0.7172 - - - - -
3.7846 330 0.6414 - - - - -
3.8984 340 0.6422 - - - - -
3.9893 348 - 0.6664 0.6597 0.6487 0.6260 0.5735
  • The bold row denotes the saved checkpoint.
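
The Epoch and Step columns can be related to the hyperparameters with a little arithmetic. The sketch below assumes a single device, 45,000 samples in batches of 32 with the last partial batch kept, and one optimizer step per 16 batches with leftover batches at the end of an epoch not forming an extra step. Under those assumptions it reproduces the logged epoch values; the exact bookkeeping inside the trainer may differ.

```python
import math

train_samples = 45_000
per_device_batch = 32
grad_accum = 16
epochs = 4

batches_per_epoch = math.ceil(train_samples / per_device_batch)  # 1407
steps_per_epoch = batches_per_epoch // grad_accum                # 87
total_steps = steps_per_epoch * epochs                           # 348
effective_batch = per_device_batch * grad_accum                  # 512

def epoch_at(step):
    """Fractional epoch reported at a given optimizer step (step >= 1)."""
    completed_epochs = (step - 1) // steps_per_epoch
    step_in_epoch = step - completed_epochs * steps_per_epoch
    return completed_epochs + step_in_epoch * grad_accum / batches_per_epoch

# Compare against a few rows of the log: 0.1137, 0.9893, 1.0341, 3.9893.
for step in (10, 87, 90, 348):
    print(step, round(epoch_at(step), 4))
```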

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 109M params (Safetensors, tensor type F32)