metadata

library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
base_model: sentence-transformers/all-mpnet-base-v2
metrics:
  - accuracy
widget:
  - text: >-
      What is the primary difference between homomorphic encryption and
      multi-party computation in the context of secure multi-party computation
      protocols?
  - text: >-
      How do organizations balance the need for innovation with the potential
      risks and unintended consequences of emerging technologies?
  - text: >-
      How doCompaniesbalanceIndividualCreativitywithTeamCollaboration to
      driveInnovationinthe WORKPlace?
  - text: >-
      How do companies balance the need for innovation with the risk of
      disrupting their existing business models?
  - text: >-
      What is the primary application of Natural Language Processing (NLP) in
      Google's BERT language model, and how does it utilize masked language
      modeling to improve contextual understanding?
pipeline_tag: text-classification
inference: true
model-index:
  - name: SetFit with sentence-transformers/all-mpnet-base-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 1
            name: Accuracy

SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
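
The contrastive step can be illustrated with a minimal sketch of how labelled few-shot examples become sentence pairs: two texts with the same label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). The helper name `make_contrastive_pairs` and the shortened example texts are hypothetical; SetFit's own sampler also shuffles and balances the pairs, which is omitted here.

```python
from itertools import combinations
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str], labels: List[str]
) -> List[Tuple[str, str, float]]:
    """Pair every two distinct examples; target is 1.0 if labels match, else 0.0."""
    examples = list(zip(texts, labels))
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Shortened, illustrative inputs that reuse this model's two labels.
texts = [
    "What is the primary difference between a GAN and a VAE?",
    "What is the primary difference between a Decision Tree and a Random Forest?",
    "How do complex systems give rise to emergent properties?",
]
labels = ["lexical", "lexical", "semantic"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Because every pair of examples yields a training signal, even a few dozen labelled texts produce hundreds of pairs for fine-tuning the embedding body.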

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 384 tokens
Number of Classes: 2 (lexical, semantic)

Model Sources

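Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)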

Model Labels

Label	Examples
semantic	'How do artificial intelligence systems navigate the trade-off between simplicity and accuracy when modeling complex real-world phenomena?' 'How do complex systems, consisting of many interconnected components, give rise to emergent properties that cannot be predicted from the characteristics of their individual parts?' 'How do complex systems, such as those found in nature and human societies, exhibit emergent properties that arise from the interactions of individual components?'
lexical	'What is the primary difference between a generative adversarial network (GAN) and a variational autoencoder (VAE) in deep learning?' 'What is the primary difference between a Decision Tree and a Random Forest in Machine Learning, and how do they alleviate overfitting?' 'What is the primary difference between a Bayesian neural network and a traditional feedforward neural network in the context of machine learning?'

Evaluation

Metrics

Label	Accuracy
all	1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download the model from the 🤗 Hub
model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
# Run inference on a single text and show the predicted label
preds = model("How doCompaniesbalanceIndividualCreativitywithTeamCollaboration to driveInnovationinthe WORKPlace?")
print(preds)
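
For more than one input, the model can also be called with a list of texts. The sketch below wraps that call in a small helper that pairs each text with its predicted label. The helper name `classify_batch` is hypothetical, and `fake_model` is only a stand-in so the snippet runs without downloading anything; it assumes the real model returns one label per input text.

```python
from typing import Callable, Dict, Sequence


def classify_batch(
    model: Callable[[Sequence[str]], Sequence], texts: Sequence[str]
) -> Dict[str, str]:
    """Run one batched call and map each input text to its predicted label."""
    preds = model(list(texts))
    return {text: str(pred) for text, pred in zip(texts, preds)}


def fake_model(texts: Sequence[str]) -> Sequence[str]:
    """Stand-in for the real model (NOT its actual decision rule)."""
    return ["lexical" if t.startswith("What is") else "semantic" for t in texts]


questions = [
    "What is the primary difference between a GAN and a VAE?",
    "How do companies balance innovation with risk?",
]
results = classify_batch(fake_model, questions)
for question, label in results.items():
    print(label, "|", question)

# With the real model (needs `pip install setfit` and network access):
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
# results = classify_batch(model, questions)
```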

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	5	18.8511	32

Label	Training Sample Count
lexical	23
semantic	24
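
Statistics like these can be reproduced with a short sketch, assuming words are counted by splitting on whitespace (the card does not state the tokenization used). Note that with 47 training samples (23 + 24) a true median is always a whole number, so the middle value 18.8511 is consistent with a mean of 886 words over 47 samples. The helper name `word_count_stats` is hypothetical.

```python
from statistics import mean, median
from typing import Dict, List


def word_count_stats(texts: List[str]) -> Dict[str, float]:
    """Return min, mean, median and max of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "mean": mean(counts),
        "median": median(counts),
        "max": max(counts),
    }


# Toy inputs; the real training texts are not included in this card.
sample_stats = word_count_stats(["a b", "a b c", "a b c d e f g"])
print(sample_stats)

# The reported middle value matches a mean over 47 samples.
print(round(886 / 47, 4))
```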

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (4, 4)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
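
With loss: CosineSimilarityLoss, the embedding fine-tuning phase pushes the cosine similarity of each sentence pair toward its target (1.0 for same-label pairs, 0.0 otherwise) using a squared error. Below is a minimal pure-Python sketch of that objective on toy 2-D vectors. The function names are illustrative, and real training uses the Sentence Transformers implementation on batched tensors. As far as the SetFit documentation describes, distance_metric and margin apply to triplet-style losses and are not used with this loss.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(
    pairs: List[Tuple[Sequence[float], Sequence[float], float]]
) -> float:
    """Mean squared error between each pair's cosine similarity and its target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings: a well-placed positive pair and a well-placed negative pair.
good_pairs = [
    ([1.0, 0.0], [2.0, 0.0], 1.0),  # same direction, target 1.0
    ([1.0, 0.0], [0.0, 3.0], 0.0),  # orthogonal, target 0.0
]
print(cosine_similarity_loss(good_pairs))

# A negative pair whose embeddings still point the same way is penalised.
bad_pairs = [([1.0, 0.0], [2.0, 0.0], 0.0)]
print(cosine_similarity_loss(bad_pairs))
```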

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0139	1	0.2662	-
0.6944	50	0.0007	-
1.0	72	-	0.0003
1.3889	100	0.0004	-
2.0	144	-	0.0001
2.0833	150	0.0003	-
2.7778	200	0.0002	-
3.0	216	-	0.0001
3.4722	250	0.0002	-
4.0	288	-	0.0001

The bold row denotes the saved checkpoint.
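
The step counts in the table can be related to the training set size with a small worked calculation. The counting below is an inference from the reported numbers, not something the card states: it assumes same-label pairs include each example paired with itself, different-label pairs cover every cross-class combination, and oversampling repeats the smaller pair set until it matches the larger one. The function name `steps_per_epoch` is hypothetical.

```python
import math
from itertools import combinations
from typing import Sequence


def steps_per_epoch(class_counts: Sequence[int], batch_size: int) -> int:
    """Estimate optimizer steps per epoch from per-class sample counts."""
    # Same-label pairs, including self-pairs: n * (n + 1) / 2 per class.
    positives = sum(n * (n + 1) // 2 for n in class_counts)
    # Different-label pairs: product of sizes for every pair of classes.
    negatives = sum(a * b for a, b in combinations(class_counts, 2))
    # Oversampling: the smaller set is repeated up to the size of the larger.
    total_pairs = 2 * max(positives, negatives)
    return math.ceil(total_pairs / batch_size)


steps = steps_per_epoch([23, 24], batch_size=16)  # 23 lexical, 24 semantic
print(steps)                 # 72 steps per epoch
print(steps * 4)             # 288 steps for num_epochs = 4
print(round(1 / steps, 4))   # 0.0139, the epoch value logged at step 1
print(round(50 / steps, 4))  # 0.6944, the epoch value logged at step 50
```

Under these assumptions there are 576 positive and 552 negative pairs, giving 1152 pairs per epoch after oversampling, which matches the 72 steps per epoch and 288 total steps in the table.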

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.6.1
Transformers: 4.39.0
PyTorch: 2.3.0+cu121
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}