SetFit with BAAI/bge-base-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-base-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
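The contrastive step trains on sentence pairs rather than on single examples. The sketch below shows one simple way such pairs can be built from a handful of labeled texts: pairs with the same label get a target similarity of 1.0 and pairs with different labels get 0.0. The function name `make_pairs` and the toy texts are hypothetical, and the SetFit library's own sampler differs in detail (for example, in how it balances and oversamples pairs).

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Illustrative only; this is not the SetFit library's sampler.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data, invented for illustration.
texts = ["grounded and relevant", "concise and correct", "off-topic", "unsupported claim"]
labels = [1, 1, 0, 0]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Four labeled texts already yield six training pairs, which is why this kind of pairing suits small training sets.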

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-base-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
0	'The answer provided does not address the question asked about the lifespan of John Augustine Zahm. It discusses the ownership of the land where the Bell and Gemmell tannery was located, which is unrelated to the specific question.\n\nReasoning:\n1. Context Grounding: The information about John Buchanan and the Bell and Gemmell tannery is grounded in the provided document but does not relate to John Augustine Zahm or his lifespan.\n2. Relevance: The answer does not address the question regarding John Augustine Zahm’s lifespan at all. It is completely off-topic.\n3. Conciseness: While the provided answer is concise, it is entirely irrelevant to the question concerning John Augustine Zahm’s lifespan. \n\nFinal Result:' 'Reasoning: \n\n1. Context Grounding: The answer correctly identifies the federal funds rate as the main tool of conventional monetary policy in the USA. This statement is well-supported by the provided document, which specifies that the federal funds rate is the primary mechanism used in conventional monetary policy in the USA, and further explains its role in interbank lending.\n \n2. Relevance: The question asks specifically about the main tool of conventional monetary policy in the USA. The answer correctly states that the federal funds rate is this tool. However, there is an error in attributing the decision of the federal funds rate to Congress when in fact it is determined by the Federal Reserve.\n\n3. Conciseness: The answer is brief and directly addresses the question without unnecessary information. However, the incorrect statement about Congress deciding the rate detracts from its overall clarity and correctness.\n\nFinal Result:' 'Reasoning:\n1. Context Grounding: The answer partially reflects the document's suggestions about using opponents' momentum and techniques to trip them. However, it introduces methods like "flailing wildly" and "jumping above punches," which are not grounded in the document provided.\n2. Relevance: The answer does not fully align with the document's practical and structured advice for takedowns. It diverges into suggestions that are unrealistic and not mentioned in the document.\n3. Conciseness: The answer is overly detailed in unnecessary ways, such as the emphasis on slow execution and crossing legs, which are not key points mentioned in the document provided. Rather, the document stresses quick and committed actions.\n\nFinal Result:'
1	"Reasoning:\n1. Context Grounding: The answer is directly supported by the content in the document. It correctly identifies the disagreement as being about the duration of the payroll tax cut and highlights the specific durations preferred by both parties.\n2. Relevance: The answer precisely addresses the question by focusing solely on the disagreement about the payroll tax cut's duration between Democrats and Republicans.\n3. Conciseness: The answer is succinct and to the point. It avoids extraneous information, sticking strictly to explaining the core disagreement.\n\nResult:" 'Reasoning:\n\n1. Context Grounding: The provided answer draws directly from the supplied document, mentioning the variety and customizability of templates offered by the organization as well as their qualities, such as including sample content and different features.\n\n2. Relevance: The response is related to the inquiry about available blog post templates. However, while it accurately includes information about the variety of templates and customization options, it should focus more narrowly on templates specifically for blog posts.\n\n3. Conciseness: The answer is somewhat verbose. It includes information that isn’t strictly necessary to answer the question, such as specifics about the overall number of templates and customization options, instead of focusing solely on blog templates.\n\n4. Correct Instructions: The answer does correctly instruct the reader on how to choose a template and its customization, but it lacks specific emphasis on the blog aspect of the templates which is critical to the question.\n\nEvaluation Result:' 'Reasoning:\n\n1. Context Grounding: The document clearly identifies "Father Joseph Carrier" as the person holding the professorship of Chemistry and Physics at Notre Dame, not "Father Josh Carrier."\n2. Relevance: The answer provided states that "Father Josh Carrier" held the professorship, which is incorrect based on the provided document.\n3. Conciseness: While the answer is concise, it is not factually accurate.\n\nFinal Result:'

Evaluation

Metrics

Label	Accuracy
all	0.7324
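Accuracy is the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch of that calculation follows; the `accuracy` helper and the toy labels are hypothetical and are not the evaluation data behind the reported figure.

```python
def accuracy(predicted, reference):
    """Return the fraction of positions where predicted == reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy labels, invented for illustration.
reference = [1, 0, 1, 1]
predicted = [1, 0, 0, 1]
score = accuracy(predicted, reference)
print(score)
```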

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_cybereason_gpt-4o_cot-instructions_remove_final_evaluation_e2_one_out_172")

# The input spans several lines, so it is written as a triple-quoted string
text = """The percentage in the response status column indicates the total amount of successful completion of response actions.

Reasoning:
1. **Context Grounding**: The answer is well-supported by the document which states, "percentage indicates the total amount of successful completion of response actions."
2. **Relevance**: The answer directly addresses the specific question asked about what the percentage in the response status column indicates.
3. **Conciseness**: The answer is succinct and to the point without unnecessary information.
4. **Specificity**: The answer is specific to what is being asked, detailing exactly what the percentage represents.
5. **Accuracy**: The answer provides the correct key/value as per the document.

Final result:"""

# Run inference
preds = model(text)

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	32	103.2508	245

Label	Training Sample Count
0	312
1	322

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
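With `loss: CosineSimilarityLoss`, each training pair is scored by the cosine similarity of its two embeddings, and the loss is the squared error between that similarity and the pair's target (1.0 for same-label pairs, 0.0 otherwise). The pure-Python sketch below illustrates the calculation on toy vectors; the helper names are hypothetical, and the actual loss in Sentence Transformers operates on batched tensors. According to the SetFit documentation, `distance_metric` and `margin` apply to triplet losses and are ignored for this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target over all pairs."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy 2-D "embeddings", invented for illustration.
pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, different labels: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # same direction, different labels: error 1
]
loss = cosine_similarity_loss(pairs)
print(loss)
```

Minimizing this quantity pulls same-label embeddings together and pushes different-label embeddings toward orthogonality.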

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0006	1	0.2802	-
0.0315	50	0.2661	-
0.0631	100	0.2533	-
0.0946	150	0.2551	-
0.1262	200	0.2561	-
0.1577	250	0.2516	-
0.1893	300	0.2488	-
0.2208	350	0.2216	-
0.2524	400	0.1693	-
0.2839	450	0.1131	-
0.3155	500	0.0797	-
0.3470	550	0.0429	-
0.3785	600	0.029	-
0.4101	650	0.0202	-
0.4416	700	0.0151	-
0.4732	750	0.0167	-
0.5047	800	0.02	-
0.5363	850	0.0118	-
0.5678	900	0.0027	-
0.5994	950	0.0031	-
0.6309	1000	0.0025	-
0.6625	1050	0.0028	-
0.6940	1100	0.0021	-
0.7256	1150	0.0019	-
0.7571	1200	0.0017	-
0.7886	1250	0.0013	-
0.8202	1300	0.0017	-
0.8517	1350	0.0014	-
0.8833	1400	0.0013	-
0.9148	1450	0.0011	-
0.9464	1500	0.0013	-
0.9779	1550	0.0013	-
1.0095	1600	0.0013	-
1.0410	1650	0.0011	-
1.0726	1700	0.0012	-
1.1041	1750	0.001	-
1.1356	1800	0.001	-
1.1672	1850	0.001	-
1.1987	1900	0.001	-
1.2303	1950	0.0009	-
1.2618	2000	0.001	-
1.2934	2050	0.001	-
1.3249	2100	0.001	-
1.3565	2150	0.0009	-
1.3880	2200	0.001	-
1.4196	2250	0.0009	-
1.4511	2300	0.0009	-
1.4826	2350	0.001	-
1.5142	2400	0.0018	-
1.5457	2450	0.0008	-
1.5773	2500	0.0008	-
1.6088	2550	0.0008	-
1.6404	2600	0.0009	-
1.6719	2650	0.0008	-
1.7035	2700	0.0008	-
1.7350	2750	0.0009	-
1.7666	2800	0.0009	-
1.7981	2850	0.0008	-
1.8297	2900	0.0008	-
1.8612	2950	0.0008	-
1.8927	3000	0.0008	-
1.9243	3050	0.0008	-
1.9558	3100	0.0009	-
1.9874	3150	0.0008	-
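The Epoch column can be reproduced from the training set size and the hyperparameters listed earlier. The sketch below assumes that each of the 634 training examples (312 + 322) contributes `num_iterations` positive and `num_iterations` negative pairs; this is an assumption about the sampler, chosen because it matches the logged values, and the variable names are illustrative.

```python
# Values taken from the tables in this card.
n_samples = 312 + 322   # training examples across both labels
num_iterations = 20
batch_size = 16
num_epochs = 2

# Assumption: num_iterations positive and num_iterations negative pairs per example.
n_pairs = 2 * num_iterations * n_samples
steps_per_epoch = n_pairs // batch_size
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps, to 4 decimals."""
    return round(step / steps_per_epoch, 4)


print(n_pairs, steps_per_epoch, total_steps)
print(epoch_at(1), epoch_at(50), epoch_at(1600), epoch_at(3150))
```

Under this assumption there are 1585 steps per epoch and 3170 steps in total, so the last logged step (3150) falls at epoch 1.9874, matching the final row of the table.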

Framework Versions

Python: 3.10.14
SetFit: 1.1.0
Sentence Transformers: 3.1.1
Transformers: 4.44.0
PyTorch: 2.4.0+cu121
Datasets: 3.0.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
