metadata

library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
base_model: BAAI/bge-m3
metrics:
  - accuracy
widget:
  - text: >-
      What is the primary difference between a Bayesian neural network and a
      traditional feedforward neural network in the context of machine learning?
  - text: >-
      What is the difference betweensupervised and unsupervised machine learning
      algorithms in terms of data labeling and model training?
  - text: >-
      What is the primary application of Natural Language Processing (NLP) in
      Google's BERT language model, and how does it utilize masked language
      modeling to improve contextual understanding?
  - text: >-
      What is the main advantage of using GraphQL over traditional RESTful APIs,
      as demonstrated by social media giant Facebook in their Facebook ADS API?
  - text: Qui est Robin Mancini ?
pipeline_tag: text-classification
inference: true
model-index:
  - name: SetFit with BAAI/bge-m3
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 1
            name: Accuracy

SetFit with BAAI/bge-m3

This is a SetFit model that can be used for Text Classification. This SetFit model uses BAAI/bge-m3 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
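
As a rough illustration of the first step, the sketch below builds the sentence pairs that a contrastive objective such as CosineSimilarityLoss consumes: two texts with the same label get a target similarity of 1.0, and two texts with different labels get 0.0. It is a simplified stand-in, not SetFit's actual sampler, which also handles shuffling and the oversampling strategy. The function name `make_contrastive_pairs` and the toy texts are hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Pair every two distinct examples; target is 1.0 if labels match, else 0.0."""
    examples = list(zip(texts, labels))
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data (hypothetical, not taken from the training set).
texts = [
    "What is GraphQL?",
    "What is Apache Hive?",
    "How do societal values shape markets?",
    "How is decision-making power allocated?",
]
labels = ["lexical", "lexical", "semantic", "semantic"]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))  # prints: 6 2 4
```

The number of pairs grows quadratically with the number of labeled examples, which is why a few dozen examples per class can still give the embedding model a sizeable contrastive training set.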

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-m3
Classification head: a LogisticRegression instance
Maximum Sequence Length: 8192 tokens
Number of Classes: 2

Model Sources

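Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)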

Model Labels

Label	Examples
lexical	'What is the definition of semantics in the context ofontology-based data integration, and how does it differ from outright data normalization, as implementented in graph databases like neo4j orAmazon Neptune?' 'What is the primary application of graph convolutional neural networks (GCNNs) in natural language processing (NLP) for modeling syntactic dependencies in parsing?' "What is the distinguising feature of Apache Hive's Metadata Tables, used for maintaining and managingtables in Hadoop Distributed File System (HDFS)?"
semantic	'What is a key challenge faced by managers in sustaining a work culture that encourages creativity, innovation, and critical thinking within the technological industry globally?' 'How might shifting societal values influence the dynamics between multinational corporations and governments, leading to Changes in the global economic landscape?' 'How does the allocation of limited resources affect the allocation of decision-making power within an organization?'

Evaluation

Metrics

Label	Accuracy
all	1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
# Run inference
preds = model("Qui est Robin Mancini ?")
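
To score several queries at once and see how confident the classifier is, a small helper along these lines may be useful. It is a sketch that assumes SetFit 1.0.x, where a loaded `SetFitModel` offers `predict` and `predict_proba` over a list of strings. The names `classify_queries` and `load_model` are hypothetical, and whether `predict` returns the strings lexical / semantic or integer class ids depends on how the labels were stored with the model.

```python
def classify_queries(model, texts):
    """Return (text, predicted label, top class probability) for each text.

    `model` is expected to behave like a loaded SetFitModel, i.e. to offer
    `predict` and `predict_proba` over a list of strings.
    """
    labels = model.predict(texts)
    probabilities = model.predict_proba(texts)
    return [
        (text, str(label), float(max(row)))
        for text, label, row in zip(texts, labels, probabilities)
    ]


def load_model(model_id="yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2"):
    """Download the model from the Hub.

    The import is kept local so that `classify_queries` stays usable
    (for example with a stub model) when setfit is not installed.
    """
    from setfit import SetFitModel

    return SetFitModel.from_pretrained(model_id)
```

A typical call would be `classify_queries(load_model(), ["Qui est Robin Mancini ?"])`, which downloads the model on first use.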

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	4	19.1392	56

Label	Training Sample Count
lexical	36
semantic	43

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (4, 4)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0050	1	0.1549	-
0.2475	50	0.0045	-
0.4950	100	0.0009	-
0.7426	150	0.0005	-
0.9901	200	0.0005	-
1.0	202	-	0.0001
1.2376	250	0.0006	-
1.4851	300	0.0006	-
1.7327	350	0.0005	-
1.9802	400	0.0004	-
2.0	404	-	0.0
2.2277	450	0.0003	-
2.4752	500	0.0003	-
2.7228	550	0.0003	-
2.9703	600	0.0003	-
3.0	606	-	0.0
3.2178	650	0.0003	-
3.4653	700	0.0004	-
3.7129	750	0.0003	-
3.9604	800	0.0002	-
4.0	808	-	0.0

The bold row denotes the saved checkpoint.
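
The Epoch column is consistent with the step count divided by the number of steps per epoch, which the table puts at 202 (the validation rows fall at steps 202, 404, 606 and 808). A small sketch of that arithmetic, with `epoch_for_step` as a hypothetical helper name:

```python
STEPS_PER_EPOCH = 202  # read off the table: validation is logged at steps 202, 404, 606, 808


def epoch_for_step(step, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional epoch reached after `step` training steps, rounded to 4 decimals."""
    return round(step / steps_per_epoch, 4)


for step in (1, 50, 200, 202, 808):
    print(step, epoch_for_step(step))
```

With the reported batch size of 16, 202 steps correspond to at most 202 × 16 = 3232 sentence pairs per epoch.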

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.6.1
Transformers: 4.39.0
PyTorch: 2.3.0+cu121
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}