---
library_name: setfit
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- text: >-
    How do companies balance individual creativity with team collaboration to
    drive innovation in the work place?
- text: >-
    How do the values of a learning organization impact its ability to innovate
    and respond to constant change?
- text: >-
    What is the primary function of the Domain Name System (DNS) layer in the
    Internet Protocol Stack, as defined by ICANN?
- text: >-
    What distinguishes a transforming industry from one that merely innovates to
    existing practices?
- text: >-
    How can artificial intelligence systems balance individual autonomy with
    collective responsibility in decision-making processes?
pipeline_tag: text-classification
inference: true
model-index:
- name: SetFit RAG query classifier for hybrid search query routing
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 1
      name: Accuracy
language:
- en
---

# Fast Query Routing for RAG Hybrid Search Using a SetFit-Tuned Embedding Model

This model classifies user queries in a RAG pipeline into two classes, 'semantic' and 'lexical'. This enables simple query routing and alpha tuning for hybrid search.
A query is considered 'semantic' if it contains no particular jargon, proper nouns, technical terms, etc. Conversely, it is considered 'lexical'
if it contains precise keywords that can be used for a lexical search (BM25, for example).

The model is very small and fast, which makes query routing far more cost-effective than relying on large LLMs such as GPT-4.
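
As a minimal sketch of how the predicted label could drive routing, the snippet below dispatches a query to one of two retrieval backends. The names `bm25_search`, `vector_search`, `ROUTES`, and `route` are hypothetical placeholders and not part of this model or of SetFit; `classify` stands for any callable that returns `'lexical'` or `'semantic'`, for example a thin wrapper around this model.

```python
from typing import Callable, Dict, List

Retriever = Callable[[str], List[str]]


def bm25_search(query: str) -> List[str]:
    # Hypothetical placeholder for a keyword (BM25) backend.
    return [f"bm25:{query}"]


def vector_search(query: str) -> List[str]:
    # Hypothetical placeholder for a dense-embedding backend.
    return [f"vector:{query}"]


# Map each class of this model to a retrieval strategy.
ROUTES: Dict[str, Retriever] = {
    "lexical": bm25_search,
    "semantic": vector_search,
}


def route(query: str, classify: Callable[[str], str]) -> List[str]:
    """Classify the query, then call the matching retriever."""
    label = classify(query)
    # Assumption: fall back to vector search if the label is unexpected.
    retriever = ROUTES.get(label, vector_search)
    return retriever(query)


# Stub classifier used here only so the sketch runs without downloading a model.
print(route("What is Apache Kafka?", lambda q: "lexical"))
```

Keeping the classifier behind a plain callable means the routing logic can be unit-tested with a stub and the real model plugged in later.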

The model was trained using the [SetFit](https://github.com/huggingface/setfit) method, which allows fine-tuning text classification models with only a small number of human-annotated training examples. This SetFit model uses [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
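
The first step relies on sentence pairs built from the labeled examples. The sketch below illustrates the general idea only: pairs with the same label get a target similarity of 1.0, pairs with different labels get 0.0, which is the kind of target a cosine-similarity loss expects. The function name `make_contrastive_pairs` is hypothetical, and SetFit's own pair sampler differs in its details (shuffling, oversampling, and so on).

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def make_contrastive_pairs(
    texts: Sequence[str], labels: Sequence[str]
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target_similarity) triples from labeled texts."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        # Same label -> embeddings should be close; different label -> far apart.
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data for illustration, not taken from the training set.
texts = ["kafka offsets", "dns layer", "team morale", "creative culture"]
labels = ["lexical", "lexical", "semantic", "semantic"]
pairs = make_contrastive_pairs(texts, labels)
print(len(pairs), sum(1 for _, _, t in pairs if t == 1.0))
```

Because the number of pairs grows quadratically with the number of examples, even a few dozen labeled queries yield thousands of training pairs for the embedding model.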

## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 384 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co./datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label    | Examples                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|:---------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| lexical  | <ul><li>'What is the primary function of the Apache Kafka distributed streaming platform in Big Data processing?'</li><li>"What is the primary difference between Hadoop's FileSystem-based architecture and Apache Cassandra's distributed, masterlessArchitecture in scale-out design?"</li><li>'What is the main difference between optimistic concurrency control and pessimistic concurrency control in database management systems?'</li></ul> |
| semantic | <ul><li>"How does organizational morale impact the competitiveness of a company in today's fast-paced market?"</li><li>'How do organizations balance individual creativity with collective goal achievement in a dynamic environment?'</li><li>'What is a key challenge faced by managers in sustaining a work culture that encourages creativity, innovation, and critical thinking within the technological industry globally?'</li></ul>          |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 1.0      |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
# Run inference
preds = model("What distinguishes a transforming industry from one that merely innovates to existing practices?")
```
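
For alpha tuning rather than hard routing, one option is to turn the class probabilities into a blending weight. The sketch below is an illustration under stated assumptions: the probabilities would come from something like `model.predict_proba(query)`, the index of the 'semantic' class is assumed to be 1 (check it against the model's label order), and alpha = 1 is taken to mean pure dense search while alpha = 0 means pure lexical search. The helper names `alpha_from_proba` and `hybrid_score` are hypothetical.

```python
from typing import Sequence


def alpha_from_proba(proba: Sequence[float], semantic_index: int = 1) -> float:
    """Use the probability of the 'semantic' class as the hybrid-search alpha.

    Assumption: `semantic_index` points at the 'semantic' class in `proba`.
    """
    p_semantic = float(proba[semantic_index])
    # Clamp to [0, 1] to guard against small numerical drift.
    return min(1.0, max(0.0, p_semantic))


def hybrid_score(dense_score: float, lexical_score: float, alpha: float) -> float:
    """Convex combination of a dense score and a lexical (e.g. BM25) score."""
    return alpha * dense_score + (1.0 - alpha) * lexical_score


# Hard-coded probabilities so the sketch runs without downloading the model.
alpha = alpha_from_proba([0.2, 0.8])
print(alpha, hybrid_score(0.9, 0.1, alpha))
```

The two scores should be normalized to a comparable range before blending; how to do that depends on the search backend.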

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 4   | 19.1839 | 42  |

| Label    | Training Sample Count |
|:---------|:----------------------|
| lexical  | 43                    |
| semantic | 44                    |

### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (3, 3)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
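
The batch size and sampling strategy, together with the label counts listed under Training Set Metrics, determine how many contrastive steps make up one epoch. The following back-of-the-envelope calculation is a reconstruction, not taken from the SetFit source. It assumes that positive pairs include each sentence paired with itself and that oversampling pads the smaller pair set to the size of the larger one. Under those assumptions it reproduces the 484 steps per epoch and 1452 total steps logged in the Training Results table.

```python
from math import ceil
from typing import Dict


def steps_per_epoch(label_counts: Dict[str, int], batch_size: int) -> int:
    """Estimate contrastive training steps per epoch under oversampling."""
    counts = list(label_counts.values())
    total = sum(counts)
    # Assumption: same-label pairs, self-pairs included -> n * (n + 1) / 2 per class.
    positive = sum(n * (n + 1) // 2 for n in counts)
    # Cross-label pairs: all unordered pairs whose members have different labels.
    negative = (total * total - sum(n * n for n in counts)) // 2
    # Assumption: oversampling repeats the smaller set until both sets are equal in size.
    n_pairs = 2 * max(positive, negative)
    return ceil(n_pairs / batch_size)


steps = steps_per_epoch({"lexical": 43, "semantic": 44}, batch_size=8)
print(steps, steps * 3)  # steps per epoch, total steps over 3 epochs
```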

### Training Results
| Epoch   | Step     | Training Loss | Validation Loss |
|:-------:|:--------:|:-------------:|:---------------:|
| 0.0021  | 1        | 0.301         | -               |
| 0.1033  | 50       | 0.1244        | -               |
| 0.2066  | 100      | 0.0021        | -               |
| 0.3099  | 150      | 0.0006        | -               |
| 0.4132  | 200      | 0.0002        | -               |
| 0.5165  | 250      | 0.0002        | -               |
| 0.6198  | 300      | 0.0001        | -               |
| 0.7231  | 350      | 0.0001        | -               |
| 0.8264  | 400      | 0.0001        | -               |
| 0.9298  | 450      | 0.0001        | -               |
| 1.0     | 484      | -             | 0.0001          |
| 1.0331  | 500      | 0.0001        | -               |
| 1.1364  | 550      | 0.0001        | -               |
| 1.2397  | 600      | 0.0001        | -               |
| 1.3430  | 650      | 0.0           | -               |
| 1.4463  | 700      | 0.0001        | -               |
| 1.5496  | 750      | 0.0001        | -               |
| 1.6529  | 800      | 0.0001        | -               |
| 1.7562  | 850      | 0.0001        | -               |
| 1.8595  | 900      | 0.0           | -               |
| 1.9628  | 950      | 0.0           | -               |
| 2.0     | 968      | -             | 0.0001          |
| 2.0661  | 1000     | 0.0001        | -               |
| 2.1694  | 1050     | 0.0001        | -               |
| 2.2727  | 1100     | 0.0           | -               |
| 2.3760  | 1150     | 0.0           | -               |
| 2.4793  | 1200     | 0.0           | -               |
| 2.5826  | 1250     | 0.0           | -               |
| 2.6860  | 1300     | 0.0001        | -               |
| 2.7893  | 1350     | 0.0           | -               |
| 2.8926  | 1400     | 0.0001        | -               |
| 2.9959  | 1450     | 0.0           | -               |
| **3.0** | **1452** | **-**         | **0.0001**      |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.6.1
- Transformers: 4.39.0
- PyTorch: 2.3.1+cu121
- Datasets: 2.18.0
- Tokenizers: 0.15.2

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->