
SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
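As a rough illustration of step 1, the sketch below builds contrastive training pairs from a handful of labeled texts: pairs with the same label get a target similarity of 1.0, and pairs with different labels get 0.0, which is the kind of target a cosine-similarity loss expects. The function name `make_contrastive_pairs` and the toy texts are hypothetical, and the exhaustive pairing is a simplification; this is not the SetFit library's own sampling code.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only; SetFit samples pairs
    internally rather than enumerating all of them.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    # Shuffle deterministically so batches mix positive and negative pairs.
    random.Random(seed).shuffle(pairs)
    return pairs


# Toy data (made up for this sketch): two texts per label.
texts = ["grounded answer", "relevant answer", "off-topic answer", "unsupported answer"]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

In step 2, the fine-tuned embedding model would encode each training text once, and those embeddings would serve as features for the classification head.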

Model Details

Model Description

Model Sources

Model Labels

Label Examples
Label 0:
  • "Reasoning:\n1. Context Grounding: The answer aligns well with the provided document, specifically discussing coach Brian Shaw's influence and changes in the team strategy, which are mentioned in the text.\n2. Relevance: The response directly addresses the question by focusing on the reasons behind the Nuggets' offensive success in January, such as the new gameplay strategy advocated by the coach and increased comfort and effectiveness.\n3. Conciseness: The answer is mostly concise but adds an unsubstantiated point about virtual reality training, which is not mentioned in the document and should be excluded to maintain briefing relevance.\n\nFinal result: ****."
  • "Reasoning:\n1. Context Grounding: The answer effectively uses specific details from the provided document, discussing the author's experience with digital and film photography, and technical differences such as how each medium handles exposure and color capture.\n2. Relevance: The answer is directly relevant to the question, enumerating specific differences mentioned by the author.\n3. Conciseness: While mostly concise, the answer could have been slightly more succinct. However, it largely avoids unnecessary information and remains clear and to the point.\n\nFinal Result:"
  • "Reasoning:\n\n1. Context Grounding: The answer given details the results of a mixed martial arts event, specifically highlighting Antonio Rogerio Nogueira's victory. However, the question asks about the main conflict in the third book of the Arcana Chronicles by Kresley Cole. There is no relevance in the provided document or the answer to the Arcana Chronicles.\n2. Relevance: The answer does not address the asked question at all. Instead, it provides information about an MMA fight, which is entirely unrelated to the Arcana Chronicles.\n3. Conciseness: While the answer is concise, it fails to answer the appropriate question, thus making its conciseness irrelevant in this context.\n\nFinal Result:"
Label 1:
  • 'Reasoning:\n\n1. Context Grounding: The answer provided is well-supported by the document and grounded in the text, which discusses best practices for web designers to avoid unnecessary revisions and conflicts. It specifically addresses parts of the document that highlight getting to know the client, signing a contract, and being honest and diplomatic.\n \n2. Relevance: The answer directly addresses the question of best practices a web designer can incorporate into their client discovery and web design process. It does not deviate into unrelated topics and remains relevant throughout.\n\n3. Conciseness: The answer is clear and concise. It covers the main points without unnecessary elaboration or inclusion of extraneous information.\n\nFinal Result:'
  • "Reasoning:\n\n1. Context Grounding: The answer provided is well-supported by the document. The document discusses the importance of drawing from one's own experiences, particularly those involving pain and emotion, in order to create genuine and relatable characters.\n2. Relevance: The answer directly addresses the question of what the author believes is the key to creating a connection between the reader and the characters.\n3. Conciseness: The answer is clear and to the point, avoiding unnecessary information.\n\nFinal Result:"
  • 'Reasoning:\n1. Context Grounding: The answer directly refers to the document, which mentions Mauro Rubin as the CEO of JoinPad during the event.\n2. Relevance: The answer specifically addresses the question asked about the CEO of JoinPad during the event.\n3. Conciseness: The answer is clear, direct, and does not include unnecessary information.\n\nFinal result:'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_gpt-4o_cot-instructions_remove_final_evaluation_e1_one_big_model_17270799")
# Run inference (the triple-quoted string keeps the multi-line input a valid literal)
preds = model("""Reasoning:
1. Context Grounding: The answer provided is directly supported by the document, which states, "Allan Cox's First Class Delivery on a H128-10W for his Level 1 certification flight."
2. Relevance: The answer directly addresses the specific question asked, detailing the rocket and the motor used for Allan Cox's Level 1 certification flight.
3. Conciseness: The answer is clear and to the point, without any extraneous information.

Final result:""")

Training Details

Training Set Metrics

Training set    Min    Median     Max
Word count      32     88.2983    198

Label    Training Sample Count
0        200
1        209

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch     Step    Training Loss    Validation Loss
0.0010    1       0.161            -
0.0489    50      0.2637           -
0.0978    100     0.2513           -
0.1466    150     0.151            -
0.1955    200     0.1002           -
0.2444    250     0.0596           -
0.2933    300     0.0383           -
0.3421    350     0.0236           -
0.3910    400     0.0121           -
0.4399    450     0.0075           -
0.4888    500     0.0046           -
0.5376    550     0.0031           -
0.5865    600     0.0029           -
0.6354    650     0.0031           -
0.6843    700     0.0017           -
0.7331    750     0.0016           -
0.7820    800     0.0014           -
0.8309    850     0.0013           -
0.8798    900     0.0014           -
0.9286    950     0.0015           -
0.9775    1000    0.0014           -
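The Epoch column can be cross-checked against the sample counts and hyperparameters reported in this card. The sketch below assumes that each of the num_iterations passes yields one positive and one negative pair per training sample; that assumption is not stated in the card, but it reproduces the logged epoch fractions.

```python
import math

# Values taken from this card.
samples_per_label = {0: 200, 1: 209}
num_iterations = 20
batch_size = 16

n_samples = sum(samples_per_label.values())

# Assumption: one positive and one negative pair per sample per iteration.
n_pairs = 2 * num_iterations * n_samples

# The last, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(n_pairs / batch_size)


def epoch_at(step):
    """Fraction of the single training epoch completed after `step` steps."""
    return round(step / steps_per_epoch, 4)


print(n_samples, n_pairs, steps_per_epoch)  # 409 16360 1023
print(epoch_at(50), epoch_at(1000))         # 0.0489 0.9775
```

Under that assumption, one epoch is 1023 steps, so the final logged row (step 1000) falls just short of the end of the epoch.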

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 109M params (Safetensors, tensor type F32)