
SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
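Conceptually, the two phases look roughly like the sketch below. This is an illustrative outline only, not the exact training script used for this model: the texts and labels are hypothetical placeholders, and in practice both phases are handled by the SetFit trainer (see Training Hyperparameters below).

# Illustrative sketch of the two SetFit training phases (hypothetical data).
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Phase 1: the Sentence Transformer body is fine-tuned with contrastive pairs
# (texts from the same class pulled together, texts from different classes
# pushed apart). The contrastive fine-tuning itself is performed by the SetFit
# trainer and is omitted from this sketch.
body = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Phase 2: a LogisticRegression head is fitted on embeddings from the fine-tuned body.
texts = ["Reasoning: ...\nEvaluation:", "Reasoning: ...\nFinal evaluation:"]  # hypothetical
labels = [0, 1]
embeddings = body.encode(texts)
head = LogisticRegression().fit(embeddings, labels)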

Model Details

Model Description

Model Sources

Model Labels

Label Examples
Label 1
  • 'Reasoning:\nhallucination - The answer introduces information that is not found in the document, which indicates that it is hallucinating.\nEvaluation:'
  • 'Reasoning:\nThe answer provided is mostly aligned with the content of the document, discussing pulse checking as a rough method to estimate if systolic blood pressure is relatively normal. However, the mention of checking after moderate activity seems slightly misrepresented compared to the source material. The source also provides minor additional context and disclaimers that the answer partially addresses.\n\nFinal Evaluation:'
  • "Reasoning:\n- Well-Supported: The answer correctly explains the flexibility in holidays, including the 4-6 weeks off, the requirement for a 2-week consecutive break, and the need for clear communication, which stems from the documents.\n- Specificity: The answer provides specific details about the holiday policy at ORGANIZATION, reflecting what's stated in the document.\n- Conciseness: The answer is clear and to the point, covering all the necessary aspects of the flexible holiday policy without unnecessary details.\n\nEvaluation:"
Label 0
  • 'Reasoning:\nirrelevant - The answer provided does not relate to the document or the specific question asked.\nEvaluation:'
  • 'Reasoning:\nThe given answer sufficiently explains the referral bonus structure, including specific amounts for typical and difficult-to-fill roles, eligibility criteria, and the referral process. It also mentions that certain roles (e.g., hiring managers) are excluded from receiving bonuses.\n\nEvaluation:'
  • "Reasoning:\ncontext grounding - The answer is well-supported by the document, although some specific points, such as drinking ice water, weren't explicitly mentioned.\nrelevance - The answer is directly related to the specific question asked.\nconciseness - While the answer is quite detailed, it remains focused and does not deviate into unrelated topics, making it concise enough given the context.\n\nEvaluation:"

Evaluation

Metrics

Label    Accuracy
all      0.7612
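
A rough way to reproduce this number is to predict on a held-out split and score the predictions. The sketch below is illustrative only; test_texts and test_labels are hypothetical placeholders for the actual evaluation data, which is not included in this card.

# Hedged accuracy check; `test_texts` / `test_labels` are hypothetical placeholders.
from setfit import SetFitModel
from sklearn.metrics import accuracy_score

model = SetFitModel.from_pretrained(
    "Netta1994/setfit_baai_newrelic_gpt-4o_improved-cot-instructions_chat_few_shot_remove_final_eval"
)
test_texts = ["Reasoning:\nThe answer is grounded in the document.\nEvaluation:"]  # hypothetical
test_labels = [1]                                                                  # hypothetical
preds = model.predict(test_texts)
print(accuracy_score(test_labels, preds))  # the reported accuracy is 0.7612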

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_newrelic_gpt-4o_improved-cot-instructions_chat_few_shot_remove_final_eval")
# Run inference
preds = model("""Reasoning:
The answer is accurately grounded in the provided document and directly addresses the question without deviating into unrelated topics. The email address for contacting regarding travel reimbursement questions is correctly cited from the document.

Final evaluation:""")

Training Details

Training Set Metrics

Training set    Min    Median    Max
Word count      3      38.1107   148

Label    Training Sample Count
0        111
1        133

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
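
These values map onto the SetFit 1.x TrainingArguments / Trainer API roughly as shown below. The sketch is illustrative: train_dataset is a hypothetical placeholder for the real training split, and the tuple-valued arguments give the (embedding fine-tuning, classifier head) settings.

# Hedged sketch mapping the listed hyperparameters onto the SetFit 1.x API.
from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical placeholder for the real training split (244 examples, labels 0/1).
train_dataset = Dataset.from_dict({
    "text": ["Reasoning: ...\nEvaluation:"] * 4,
    "label": [0, 1, 0, 1],
})

model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")
args = TrainingArguments(
    batch_size=(16, 16),                 # (embedding phase, classifier phase)
    num_epochs=(2, 2),
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    load_best_model_at_end=False,
)
# distance_metric and margin from the list above only apply to triplet-style
# losses and are left at their defaults here.
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()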

Training Results

Epoch Step Training Loss Validation Loss
0.0016 1 0.2275 -
0.0820 50 0.2565 -
0.1639 100 0.2275 -
0.2459 150 0.1873 -
0.3279 200 0.1281 -
0.4098 250 0.0495 -
0.4918 300 0.0251 -
0.5738 350 0.0142 -
0.6557 400 0.0181 -
0.7377 450 0.0188 -
0.8197 500 0.0111 -
0.9016 550 0.0098 -
0.9836 600 0.0111 -
1.0656 650 0.0108 -
1.1475 700 0.0135 -
1.2295 750 0.0102 -
1.3115 800 0.0119 -
1.3934 850 0.0086 -
1.4754 900 0.0085 -
1.5574 950 0.0089 -
1.6393 1000 0.0101 -
1.7213 1050 0.0121 -
1.8033 1100 0.0112 -
1.8852 1150 0.0122 -
1.9672 1200 0.0099 -

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.1.0
  • Sentence Transformers: 3.1.1
  • Transformers: 4.44.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}