Polite Guard
- Model type: BERT* (Bidirectional Encoder Representations from Transformers)
- Architecture: Fine-tuned BERT-base uncased
- Task: Text Classification
- Source Code: (https://github.com/intel/polite-guard)
- Dataset: (https://huggingface.co/datasets/Intel/polite-guard)
Polite Guard is an open-source NLP language model developed by Intel, fine-tuned from BERT for text classification tasks. It is designed to classify text into four categories: polite, somewhat polite, neutral, and impolite. This model, along with its accompanying datasets and source code, is available on Hugging Face* and GitHub* to enable both communities to contribute to developing more sophisticated and context-aware AI systems.
Use Cases
Polite Guard provides a scalable model development pipeline and methodology, making it easier for developers to create and fine-tune their own models. Other contributions of the project include:
- Improved Robustness: Polite Guard enhances the resilience of systems by providing a defense mechanism against adversarial attacks. This ensures that the model can maintain its performance and reliability even when faced with potentially harmful inputs.
- Benchmarking and Evaluation: The project introduces the first politeness benchmark, allowing developers to evaluate and compare the performance of their models in terms of politeness classification. This helps in setting a standard for future developments in this area.
- Enhanced Customer Experience: By ensuring respectful and polite interactions on various platforms, Polite Guard can significantly boost customer satisfaction and loyalty. This is particularly beneficial for customer service applications where maintaining a positive tone is crucial.
Description of labels
- polite: Text is considerate and shows respect and good manners, often including courteous phrases and a friendly tone.
- somewhat polite: Text is generally respectful but lacks warmth or formality, communicating with a decent level of courtesy.
- neutral: Text is straightforward and factual, without emotional undertones or specific attempts at politeness.
- impolite: Text is disrespectful or rude, often blunt or dismissive, showing a lack of consideration for the recipient's feelings.
Model Details
- Training Data: The model was trained on the Polite Guard Dataset utilizing Intel® Gaudi® AI accelerators. The training dataset consists of synthetically generated customer service interactions across various sectors, including finance, travel, food and drink, retail, sports clubs, culture and education, and professional development.
- Base Model: BERT-base, with 12 layers, 110M parameters.
- Fine-tuning Process: Fine-tuning was performed on the Polite Guard train dataset with the following hyperparameters using PyTorch Lightning*.
| Hyperparameter | Batch size | Learning rate | Learning rate schedule | Max epochs | Optimizer | Weight decay | Precision |
|---|---|---|---|---|---|---|---|
| Value | 32 | 2.90e-5 | Linear warmup (10% of steps) | 2 | AdamW | 0.0003 | bf16-mixed |
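To make the "Linear warmup (10% of steps)" entry concrete, here is a minimal sketch of such a schedule in plain Python. The table does not say what happens after warmup; the linear decay to zero used here is an assumption (one common choice), and the function name `lr_at_step` is hypothetical rather than taken from the project's code.

```python
def lr_at_step(step, total_steps, peak_lr=2.90e-5, warmup_fraction=0.10):
    """Learning rate at a given optimizer step.

    Ramps linearly from 0 to peak_lr over the first warmup_fraction of steps.
    ASSUMPTION: afterwards it decays linearly to 0; the model card only
    specifies the warmup part.
    """
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / remaining)


# Example: inspect a few points of a 1000-step run.
for s in (0, 50, 100, 550, 1000):
    print(s, lr_at_step(s, total_steps=1000))
```

With 1000 total steps, the rate reaches its peak at step 100 and, under the decay assumption, returns to zero at the final step.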
Hyperparameter tuning was performed using Bayesian optimization with the Tree-structured Parzen Estimator (TPE) algorithm through Optuna* with 35 trials to maximize the validation F1-score. The hyperparameter search space included:
- learning rate: [1e-5, 5e-4]
- weight decay: [1e-6, 1e-2]
The fine-tuning process used Optuna's pruning callback to terminate underperforming hyperparameter trials, and model checkpointing to save the best performing model states.
The code for the synthetic data generation and fine-tuning can be found in the source code repository (https://github.com/intel/polite-guard).
Metrics
Here are the key performance metrics of the model on the test dataset containing both synthetic and manually annotated data:
- Accuracy: 88.4% on the Polite Guard test dataset.
- F1-Score: 88.4% on the Polite Guard test dataset.
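For readers who want to reproduce this kind of evaluation on their own labeled data, the following is a minimal sketch of computing accuracy and an F1-score over the four classes. The model card does not state which F1 averaging was used; macro averaging here is an assumption, and the function names and toy labels are hypothetical.

```python
LABELS = ["polite", "somewhat polite", "neutral", "impolite"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (ASSUMPTION: macro averaging)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up labels, only to show the call pattern.
y_true = ["polite", "neutral", "impolite", "somewhat polite"]
y_pred = ["polite", "neutral", "neutral", "somewhat polite"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```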
How to Use
You can use this model directly with a pipeline for categorizing text into classes polite, somewhat polite, neutral, and impolite.
```python
from transformers import pipeline

classifier = pipeline("text-classification", model="Intel/polite-guard")
text = "Your input text"
print(classifier(text))
```
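Building on the pipeline call above, a small helper can screen a batch of messages and keep only those predicted as impolite. This is a sketch under assumptions: the helper names and the `threshold` value are hypothetical, and it assumes the pipeline returns one `{"label": ..., "score": ...}` dictionary per input with label strings matching the category names listed earlier.

```python
def load_classifier():
    """Load the Polite Guard pipeline (downloads the model on first use)."""
    from transformers import pipeline

    return pipeline("text-classification", model="Intel/polite-guard")


def flag_impolite(classifier, texts, threshold=0.5):
    """Return (text, score) pairs predicted 'impolite' with score >= threshold.

    ASSUMPTION: classifier(list_of_texts) yields one dict per text with
    'label' and 'score' keys; threshold=0.5 is an arbitrary example value.
    """
    texts = list(texts)
    predictions = classifier(texts)
    return [
        (text, pred["score"])
        for text, pred in zip(texts, predictions)
        if pred["label"] == "impolite" and pred["score"] >= threshold
    ]


# Usage (requires the transformers package and network access):
# classifier = load_classifier()
# print(flag_impolite(classifier, ["Thanks so much for your help!", "Figure it out yourself."]))
```

Passing the classifier in as an argument keeps the filtering logic easy to test with a stand-in function before the real model is loaded.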
Articles
To learn more about the implementation of the data generator and fine-tuner packages, refer to
- Synthetic Data Generation with Language Models: A Practical Guide, and
- How to Fine-Tune Language Models: First Principles to Scalable Performance.
For more AI development how-to content, visit Intel® AI Development Resources.
Join the Community
If you are interested in exploring other models, join us in the Intel and Hugging Face communities. These models simplify the development and adoption of Generative AI solutions, while fostering innovation among developers worldwide. If you find this project valuable, please like ❤️ it on Hugging Face and share it with your network. Your support helps us grow the community and reach more contributors.
Disclaimer
Polite Guard has been trained and validated on a limited set of data that pertains to customer reviews, product reviews, and corporate communications. Accuracy metrics cannot be guaranteed outside these narrow use cases, and therefore this tool should be validated within the specific context of use for which it might be deployed. This tool is not intended to be used to evaluate employee performance. This tool is not sufficient to prevent harm in many contexts, and additional tools and techniques should be employed in any sensitive use case where impolite speech may cause harm to individuals, communities, or society.
Privacy Notice
Please note that the Polite Guard model uses AI technology and you are interacting with a chatbot. Prompts that are being used during the demo will not be stored. For information regarding the handling of personal data collected, refer to the Global Privacy Notice (https://www.intel.com/content/www/us/en/privacy/intelprivacy-notice.html), which encompasses our privacy practices.
*Other names and brands may be claimed as the property of others.