---
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: How does technology impact our daily lives and what benefits can it bring to various activities?
- text: How do organizations effectively deploy and manage machine learning algorithms to drive business value?
- text: What are the key considerations for organizing and managing computer lab resources and tracking their status?
- text: How can batch processing improve the efficiency of data lake operations?
- text: What is the purpose of setting up a CUPS on a server?
inference: true
model-index:
- name: SetFit with sentence-transformers/all-MiniLM-L6-v2
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8947368421052632
      name: Accuracy
---

# SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
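Step 1 above trains on sentence pairs rather than on single sentences. The following is a minimal sketch of how such pairs might be built from a small labeled set, using example queries from this card's label table. `make_contrastive_pairs` is a hypothetical helper written for illustration, not part of the SetFit API, and the library's own sampler differs in detail (for example, it can oversample). The targets 1.0 and 0.0 are the similarity values a cosine-similarity loss would push same-label and different-label pairs toward.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled texts.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    examples = list(zip(texts, labels))
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Example queries and labels taken from the "Model Labels" table of this card.
texts = [
    "How can one organize a running event in Belgium?",
    "What are the primary areas of focus in the domain of Data Science and Analysis?",
    "What changes can be made to a channel header?",
    "Who is responsible for managing guarantees and prolongations?",
]
labels = ["lexical", "lexical", "semantic", "semantic"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f} | {text_a} | {text_b}")
```

Four labeled sentences already yield six training pairs, which is why this approach can work with few labeled examples.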
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 256 tokens
- **Number of Classes:** 2 classes

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label    | Examples |
|:---------|:---------|
| lexical  | <ul><li>"How does Happeo's search AI work to provide answers to user queries?"</li><li>'What are the primary areas of focus in the domain of Data Science and Analysis?'</li><li>'How can one organize a running event in Belgium?'</li></ul> |
| semantic | <ul><li>'What changes can be made to a channel header?'</li><li>'How can hardware capabilities impact the accuracy of motion and object detections?'</li><li>'Who is responsible for managing guarantees and prolongations?'</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.8947   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
# Run inference
preds = model("What is the purpose of setting up a CUPS on a server?")
```

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 4   | 13.7407 | 28  |

| Label    | Training Sample Count |
|:---------|:----------------------|
| lexical  | 44                    |
| semantic | 118                   |

### Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True

### Training Results
| Epoch   | Step    | Training Loss | Validation Loss |
|:-------:|:-------:|:-------------:|:---------------:|
| 0.0020  | 1       | 0.4064        | -               |
| 0.0998  | 50      | 0.2177        | -               |
| 0.1996  | 100     | 0.0437        | -               |
| 0.2994  | 150     | 0.0057        | -               |
| 0.3992  | 200     | 0.0034        | -               |
| 0.4990  | 250     | 0.0009        | -               |
| 0.5988  | 300     | 0.0009        | -               |
| 0.6986  | 350     | 0.0007        | -               |
| 0.7984  | 400     | 0.0007        | -               |
| 0.8982  | 450     | 0.0009        | -               |
| 0.9980  | 500     | 0.0005        | -               |
| **1.0** | **501** | **-**         | **0.1811**      |

* The bold row denotes the saved checkpoint.
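The 501 steps in one epoch can be related to the training sample counts (44 lexical, 118 semantic) and the batch size of 32 with a little arithmetic. The sketch below rests on two assumptions that this card does not state: that positive and negative pairs are counted over all unordered sample pairs, including a sample paired with itself, and that `oversampling` repeats the smaller pair set until it matches the larger one. Under those assumptions the count comes out at the step total reported in the table.

```python
import math


def pairs_with_self(n):
    # Unordered pairs drawn from n items, counting an item paired with itself
    # (assumption about how pairs are enumerated).
    return n * (n + 1) // 2


# Values taken from this card: training sample counts and batch size.
label_counts = {"lexical": 44, "semantic": 118}
batch_size = 32

total_samples = sum(label_counts.values())

# Same-label pairs, summed over both classes.
positive_pairs = sum(pairs_with_self(n) for n in label_counts.values())
# Every remaining pair mixes the two labels.
negative_pairs = pairs_with_self(total_samples) - positive_pairs

# Assumed oversampling rule: the smaller set is repeated to match the larger.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)
# prints: 8011 5192 16022 501
```

The epoch column follows from the same total: step 50 of 501 is 50 / 501 ≈ 0.0998 of an epoch.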
### Framework Versions
- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.6.1
- Transformers: 4.39.0
- PyTorch: 2.3.1+cu121
- Datasets: 2.18.0
- Tokenizers: 0.15.2

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```