SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
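
As a rough illustration of step 1, the sketch below builds the kind of sentence pairs that contrastive fine-tuning consumes: same-label pairs get a target similarity of 1.0 and different-label pairs a target of 0.0. The function name `sample_contrastive_pairs` and the toy texts are hypothetical, and this is a simplified stand-in rather than SetFit's actual sampler.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str, float]


def sample_contrastive_pairs(
    texts: List[str], labels: List[int], num_iterations: int, seed: int = 42
) -> List[Pair]:
    """Hypothetical helper: per iteration, draw one positive and one
    negative partner for every sample (simplified, not SetFit's code)."""
    rng = random.Random(seed)
    pairs: List[Pair] = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates sharing the label (excluding the sample itself).
            same = [t for j, (t, l) in enumerate(zip(texts, labels))
                    if l == label and j != i]
            # Candidates with a different label.
            diff = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


# Toy data, invented for illustration only.
texts = ["grounded answer", "well supported answer",
         "wrong price quoted", "address not in document"]
labels = [1, 1, 0, 0]
pairs = sample_contrastive_pairs(texts, labels, num_iterations=3)
print(len(pairs))  # 4 samples * 2 pairs * 3 iterations = 24
```

The embedding model is then trained so that the cosine similarity of each pair moves toward its target, and step 2 fits the classification head on the resulting embeddings.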

Model Details

Model Description

Model Sources

Model Labels

Label Examples
1
  • "Reasoning: \n\n1. Context Grounding: The provided document details the events leading to Joan Gaspart’s resignation and confirms it happened after a poor season in 2003.\n2. Relevance: The answer directly addresses the specific question about who resigned from the presidency after Barcelona's poor showing in the 2003 season.\n3. Conciseness: The answer is clear and to the point without unnecessary information.\n\nFinal Result:"
  • "Reasoning:\n1. Context Grounding: The answer appropriately pulls references from the document, mentioning the hazards of working with electricity and the potential for long-term issues if electrical work isn't done correctly, which aligns with the provided content.\n2. Relevance: The answer directly addresses why it is beneficial to hire a professional electrician like O’Hara Electric, explicitly tying into the concerns of safety, expertise, and ensuring the job is done correctly on the first attempt.\n3. Conciseness: The answer is concise and to the point, avoiding unnecessary information and sticking closely to the reasons why hiring a professional is recommended.\n\nFinal Result:"
  • 'Reasoning:\n1. Context Grounding: The provided document explicitly states that Aerosmith's 1987 comeback album was "Permanent Vacation". The answer is directly supported by this information.\n2. Relevance: The answer is directly related to and completely addresses the question about the title of Aerosmith's 1987 comeback album.\n3. Conciseness: The answer is concise and to the point, providing only the necessary information without any extraneous details.\n\nFinal result:'
0
  • 'Reasoning:\n1. Context Grounding: The response effectively uses the provided document to form its recommendations. It pulls together various tips on identifying and avoiding triggers, utilizing sensory substitutes, and participating in alternative activities to manage cravings. \n2. Relevance: The answer remains focused on the question, directly addressing how to stop cravings for smoking by providing actionable advice and methods.\n3. Conciseness: While informative, the answer could benefit from being slightly more concise. Some points, such as the detailed explanation about licorice root,might be trimmed for brevity.\n\nFinal Result:'
  • 'Reasoning:\n1. Context Grounding: The provided answer is rooted in the document, which mentions that Amy Bloom finds starting a project hard and having to clear mental space, recalibrate, and become less involved in her everyday life.\n2. Relevance: The response accurately focuses on the challenges Bloom faces when starting a significant writing project, without deviating into irrelevant areas.\n3. Conciseness: The answer effectively summarizes the relevant information from the document, staying clear and to the point while avoiding unnecessary detail.\n\nFinal Result:'
  • 'Reasoning:\n1. Context Grounding: The provided document does have a listing for a 6 bedroom detached house. The factual details in the answer given by the user are not present in the document.\n2. Relevance: The user’s answer lists a different price (£2,850,000 vs. £950,000) and an incorrect address (Highgate Lane, Leeds, Berkshire, RG12 vs Willow Drive, Twyford, Reading, Berkshire, RG10) than the provided document.\n3. Conciseness: The user’s answer, though concise, is factually incorrect based on the document.\n\nFinal Result:'

Evaluation

Metrics

Label Accuracy
all 0.8108
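
For reference, accuracy is the fraction of predictions that match the reference labels. The minimal sketch below uses invented toy labels, not the evaluation data behind the 0.8108 figure, and the helper name `accuracy` is hypothetical.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy labels, invented for illustration only.
toy_preds = [1, 0, 1, 1, 0]
toy_refs = [1, 0, 0, 1, 0]
toy_accuracy = accuracy(toy_preds, toy_refs)
print(toy_accuracy)  # 4 of 5 correct -> 0.8
```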

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_cot-instructions_remove_final_evaluation_e1_larger_train_1")
# Run inference
preds = model("Reasoning: The given answer is well-supported by the provided document and includes details that are directly relevant to the identification of a funnel spider, such as the dark brown or black body and the presence of a hard, shiny carapace. It also mentions the two large fangs which is a key characteristic. The answer stays focused on the specific question of identifying a funnel spider, without deviating into unrelated topics.\n\nFinal Result:")

Training Details

Training Set Metrics

Training set Min Median Max
Word count 33 88.6482 198

Label Training Sample Count
0 95
1 104

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
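
The arithmetic below is a sketch of how these settings relate to the step counts logged under Training Results. It assumes that each training sample contributes one positive and one negative pair per iteration; that assumption is not stated in this card, but it reproduces the logged epoch fractions.

```python
import math

num_samples = 95 + 104   # training samples for labels 0 and 1 (Training Set Metrics)
num_iterations = 20      # from the hyperparameters above
batch_size = 16          # embedding-phase batch size

# Assumption: 2 pairs (one positive, one negative) per sample per iteration.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)

print(num_pairs, steps_per_epoch)      # 7960 498
print(round(50 / steps_per_epoch, 4))  # 0.1004, the epoch logged at step 50
```

With num_epochs set to 1, the last logged step (450) falls just short of the roughly 498 steps in the epoch, which matches the final epoch value of 0.9036.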

Training Results

Epoch Step Training Loss Validation Loss
0.0020 1 0.2432 -
0.1004 50 0.256 -
0.2008 100 0.2208 -
0.3012 150 0.0894 -
0.4016 200 0.0315 -
0.5020 250 0.0065 -
0.6024 300 0.0025 -
0.7028 350 0.0022 -
0.8032 400 0.002 -
0.9036 450 0.002 -

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 109M params (Safetensors, tensor type F32)