---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: >-
Evaluation:
1. **Context Grounding**: The answer references the document, but it
consists of multiple points that are not directly related to accessing
training resources. While some methods mentioned might facilitate training
(e.g., learning budget), others (like using 1Password, Tresorit) are not
directly relevant to accessing training resources.
2. **Relevance**: The answer partially addresses the question. Points
about accessing documents or requesting a learning budget are somewhat
related, but the inclusion of security tools and password managers is
irrelevant to the question of accessing training resources.
3. **Conciseness**: The answer includes unnecessary details that do not
directly answer the question, making it lengthy and off-point in segments.
4. **Specificity**: The answer is a mix of specific steps and unrelated
information. It fails to provide a direct method to access training
resources comprehensively.
5. **General Tips**: The answer does provide a step about talking to a
lead for a learning budget, which can be considered a relevant tip.
However, it's buried in a lot of unrelated content.
Overall, the answer deviates too much from the specific question about
accessing training resources and includes information not requested nor
directly relevant.
Final result: **Bad**
- text: >-
Evaluation:
The answer provided is concise and directly addresses the question of whom
to contact for travel reimbursement questions. It correctly refers to the
email address provided in the document. The answer is well-supported by
the document and does not deviate into unrelated topics.
The final evaluation: Good
- text: >-
Reasoning:
1. **Context Grounding**: The answer accurately references content from
the provided documents, especially the points related to actively thinking
about the possibility of someone leaving, flagging it to HR, analyzing
problems, and providing feedback.
2. **Relevance**: The answer directly addresses the question by outlining
why it is crucial for team leads to consider the possibility of staff
leaving and the steps they can take to mitigate issues early on.
3. **Conciseness**: The answer is relatively concise but gets slightly
verbose towards the end. The core points are clearly made without too much
unnecessary information.
4. **Specificity**: The answer includes specific reasons like addressing
underperformance, lack of growth, disagreement with direction, and
maintaining a supportive environment, all of which are well-supported by
the documents.
5. **Avoiding Generality**: The answer provides detailed steps and reasons
as mentioned in the documents, avoiding overly general statements.
Final Result: **Good**
- text: >-
Evaluation:
The answer addresses the question partially by suggesting ways to learn
about ORGANIZATION through their website and available job postings, which
are referenced in the documents. However, it misses more specific ways to
understand their products and challenges; for example, details from the
document about stress management, inclusivity issues, and organizational
changes could provide better insights into their future and challenges.
Additionally, a newsletter about "behind the scenes" is more about updates
rather than specific details on products and challenges. Therefore, the
answer lacks completeness and specificity.
The final evaluation: Bad
- text: >-
**Evaluation Reasoning:**
1. **Context Grounding:** The answer is well-supported by the provided
documents, especially Document 1. It correctly identifies the
responsibilities of ORGANIZATION_2, Thomas Barnes, and Charlotte Herrera.
2. **Relevance:** The answer addresses the specific question about the
role of ORGANIZATION_2 in the farewell process.
3. **Conciseness:** The response includes some redundant information,
especially the repetition of individuals involved and some operational
details not directly related to the extent of ORGANIZATION_2's
involvement.
4. **Specificity:** It clearly outlines the roles and describes the
process for different situations, including handling paperwork, process,
and tough conversations.
5. **General vs. Specific Tips:** The answer could have been more concise
regarding the involvement extent but remains within bounds.
**Final Evaluation: Good**
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
results:
- task:
type: text-classification
name: Text Classification
dataset:
name: Unknown
type: unknown
split: test
metrics:
- type: accuracy
value: 0.6119402985074627
name: Accuracy
---
SetFit with BAAI/bge-base-en-v1.5
This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
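The contrastive stage trains the embedding model on text pairs labeled by whether they share a class. A minimal, illustrative sketch of that pair construction (the actual SetFit sampler, e.g. under `sampling_strategy: oversampling`, differs in its details):

```python
from itertools import combinations, product

def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, similarity) pairs for contrastive fine-tuning:
    same-label pairs get target 1.0, cross-label pairs get 0.0 -- the
    targets CosineSimilarityLoss regresses against."""
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    pairs = []
    # Positive pairs: every combination of texts within a class.
    for group in by_label.values():
        pairs += [(a, b, 1.0) for a, b in combinations(group, 2)]
    # Negative pairs: every combination of texts across two classes.
    for label_a, label_b in combinations(list(by_label), 2):
        pairs += [(a, b, 0.0) for a, b in product(by_label[label_a], by_label[label_b])]
    return pairs

texts = ["good answer", "well grounded", "off-topic", "not concise"]
labels = [1, 1, 0, 0]
pairs = make_contrastive_pairs(texts, labels)
# 2 positive pairs (one within each class) + 4 negative pairs (across classes)
```

The fine-tuned body then produces embeddings in which same-class texts sit close together, which is what makes the small classification head effective.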
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: BAAI/bge-base-en-v1.5
- Classification head: a LogisticRegression instance
- Maximum Sequence Length: 512 tokens
- Number of Classes: 2 classes
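The second stage fits the LogisticRegression head on sentence embeddings from the fine-tuned body. A self-contained sketch, with random 768-dimensional vectors standing in for real bge-base-en-v1.5 embeddings (class sizes mirror the 33/32 training split reported below):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
# Random stand-ins for 768-dim sentence embeddings; a real pipeline would
# encode the training texts with the fine-tuned Sentence Transformer body.
X_good = rng.normal(loc=0.1, scale=1.0, size=(33, 768))   # label 1 ("Good")
X_bad = rng.normal(loc=-0.1, scale=1.0, size=(32, 768))   # label 0 ("Bad")
X = np.vstack([X_good, X_bad])
y = np.array([1] * 33 + [0] * 32)

head = LogisticRegression(max_iter=1000).fit(X, y)
preds = head.predict(X)
```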
Model Sources
- Repository: SetFit on GitHub
- Paper: Efficient Few-Shot Learning Without Prompts
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts
Model Labels
Label | Examples |
---|---|
0 | |
1 | |
Evaluation
Metrics
Label | Accuracy |
---|---|
all | 0.6119 |
Uses
Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_newrelic_gpt-4o_cot-few_shot-instructions_only_reasoning_1726750400.408156")
# Run inference
preds = model("""Evaluation:
The answer provided is concise and directly addresses the question of whom to contact for travel reimbursement questions. It correctly refers to the email address provided in the document. The answer is well-supported by the document and does not deviate into unrelated topics.
The final evaluation: Good""")
```
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 30 | 106.3538 | 221 |
Label | Training Sample Count |
---|---|
0 | 32 |
1 | 33 |
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (5, 5)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
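As a consistency check on the training log below (assuming SetFit's usual pair generation of `num_iterations` positive and `num_iterations` negative pairs per sample): 20 iterations × 65 training samples × 2 pairs = 2,600 pairs, which at batch size 16 gives 163 optimizer steps per epoch.

```python
import math

num_iterations, samples, batch_size = 20, 65, 16
pairs = num_iterations * samples * 2           # one positive + one negative pair per iteration
steps_per_epoch = math.ceil(pairs / batch_size)
print(steps_per_epoch)                         # 163
print(round(100 / steps_per_epoch, 4))         # 0.6135, matching the log row at step 100
```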
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0061 | 1 | 0.2332 | - |
0.3067 | 50 | 0.2674 | - |
0.6135 | 100 | 0.2116 | - |
0.9202 | 150 | 0.0354 | - |
1.2270 | 200 | 0.0036 | - |
1.5337 | 250 | 0.0022 | - |
1.8405 | 300 | 0.0017 | - |
2.1472 | 350 | 0.0016 | - |
2.4540 | 400 | 0.0015 | - |
2.7607 | 450 | 0.0013 | - |
3.0675 | 500 | 0.0013 | - |
3.3742 | 550 | 0.0012 | - |
3.6810 | 600 | 0.0012 | - |
3.9877 | 650 | 0.0011 | - |
4.2945 | 700 | 0.0012 | - |
4.6012 | 750 | 0.0011 | - |
4.9080 | 800 | 0.0011 | - |
Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.0
- Transformers: 4.44.0
- PyTorch: 2.4.1+cu121
- Datasets: 2.19.2
- Tokenizers: 0.19.1
Citation
BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```