Model Card for Model ID
This is a fine-tuned version of MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33 on a custom Turkish dataset for zero-shot classification tasks. The model is trained to classify textual inputs as entailment or not_entailment for multiple categories such as Memnuniyet (user satisfaction), Şikayet (complaints), and others.
The model builds on DeBERTa-v3-base and is designed for fine-grained understanding of customer feedback and textual sentiment in the Turkish language.
Model Details
Model Description
- Developed by: yeniguno
- Model type: Natural Language Inference (NLI) / Zero-shot classification
- Language(s) (NLP): Turkish
- License: MIT
- Finetuned from model: MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33
Uses
- Analyzing customer feedback in Turkish.
- Zero-shot classification of user reviews into predefined categories.
Bias, Risks, and Limitations
The model is trained exclusively on Turkish data and may not perform well on other languages. Biases in the training data (e.g., overrepresentation of certain categories) may affect performance. Predictions also depend heavily on how hypotheses are constructed: ambiguous or irrelevant hypotheses may lead to suboptimal predictions.
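Since predictions depend on how each candidate label is turned into a hypothesis, it can help to inspect the premise/hypothesis pairs explicitly. The following is a minimal sketch: the Turkish template string and the helper name `build_hypotheses` are assumptions for illustration, not the template used during fine-tuning. The Hugging Face zero-shot pipeline accepts a comparable `hypothesis_template` argument.

```python
# Hypothetical template; the template used for fine-tuning is not documented here.
HYPOTHESIS_TEMPLATE = "Bu yorum {} ile ilgilidir."


def build_hypotheses(text, labels, template=HYPOTHESIS_TEMPLATE):
    """Return (premise, hypothesis) pairs, one per candidate label."""
    if "{}" not in template:
        raise ValueError("template must contain a '{}' placeholder")
    return [(text, template.format(label)) for label in labels]


pairs = build_hypotheses("Uygulama sürekli çöküyor.", ["Şikayet", "Memnuniyet"])
for premise, hypothesis in pairs:
    print(f"{premise} -> {hypothesis}")
```

Comparing a few alternative templates on held-out reviews is one way to check how sensitive the scores are to wording.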
How to Get Started with the Model
You can use the model for inference with the Hugging Face pipeline functionality. The example below classifies a sample review:
from transformers import pipeline
CANDIDATE_LABELS = [
    "Uygulama Performansı",
    "Kullanıcı Arayüzü",
    "Güncellemeler",
    "Hata ve çökme",
    "Reklam",
    "Satın Alımlar",
    "Müşteri Desteği",
    "Abonelik",
    "Memnuniyet",
    "Özellikler",
    "Şikayet",
]
# Load the pipeline
classifier = pipeline("zero-shot-classification", model="yeniguno/nli-zero-shot-reviews-MiniLM-turkish-v1")
text = "ChatGPT'nin abonelik ücreti çok yüksek ve müşteri desteği hiç yok, sorun yaşadığınızda yardım alabileceğiniz kimseyi bulamıyorsunuz."
response = classifier(text, CANDIDATE_LABELS, multi_label=True)
for label, score in zip(response["labels"], response["scores"]):
    print(f"{label}: {score:.3f}")
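With `multi_label=True`, each label is scored independently, so a common follow-up step is to keep only the labels above a cutoff. The sketch below assumes a threshold of 0.5 and a hypothetical helper `select_labels`. It runs on a mock dictionary in the same shape the pipeline returns, and the scores are made up for illustration.

```python
def select_labels(response, threshold=0.5):
    """Keep (label, score) pairs whose score is at or above the threshold."""
    return [
        (label, score)
        for label, score in zip(response["labels"], response["scores"])
        if score >= threshold
    ]


# Mock output shaped like the pipeline result; these scores are NOT real model output.
mock_response = {
    "sequence": "...",
    "labels": ["Abonelik", "Müşteri Desteği", "Reklam"],
    "scores": [0.97, 0.91, 0.04],
}
print(select_labels(mock_response))
```

The right threshold depends on the application and is best tuned on labeled validation data.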
Training Details
Training Data
A Turkish dataset with 951,093 samples, consisting of customer reviews paired with predefined hypotheses. Label distribution:
- 0: Neutral / not entailment (813,408 examples)
- 1: Entailment (137,685 examples)
Each hypothesis was constructed using the following candidate labels:
Turkish Label | Description |
---|---|
Memnuniyet | User satisfaction |
Şikayet | Complaints |
Özellikler | Features |
Satın Alımlar | Purchases |
Kullanıcı Arayüzü | User interface |
Uygulama Performansı | App performance |
Hata ve Çökme | Bugs and crashes |
Müşteri Desteği | Customer support |
Abonelik | Subscriptions |
Reklam | Advertisements |
Güncellemeler | Updates |
Training Hyperparameters
- Batch Size: 128
- Learning Rate: 2e-5
- Epochs: 3
- Label Smoothing Factor: 0.1
- Optimizer: AdamW
- Scheduler: Linear with warmup
- Loss Function: Weighted cross-entropy to address class imbalance.
- ...
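The weighted cross-entropy entry above does not state how the class weights were chosen. As a sketch under that caveat, the snippet below derives inverse-frequency ("balanced") weights from the class counts in the Training Data section and applies them in a plain-Python weighted cross-entropy. The helper names and the weighting scheme are assumptions, and label smoothing is left out for brevity.

```python
import math

# Class counts taken from the Training Data section.
CLASS_COUNTS = {0: 813_408, 1: 137_685}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count)."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


def weighted_cross_entropy(probs, targets, weights):
    """Weighted mean of -log p(target), normalized by the sum of target weights."""
    losses = [weights[t] * -math.log(p[t]) for p, t in zip(probs, targets)]
    return sum(losses) / sum(weights[t] for t in targets)


weights = balanced_class_weights(CLASS_COUNTS)
print({label: round(w, 3) for label, w in weights.items()})

# Toy predicted probabilities for two examples (illustrative values only).
toy_probs = [[0.9, 0.1], [0.3, 0.7]]
toy_targets = [0, 1]
print(round(weighted_cross_entropy(toy_probs, toy_targets, weights), 4))
```

Under this scheme the minority entailment class receives a weight roughly six times that of the majority class, so errors on it contribute more to the loss.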
Evaluation
Results
Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|---|---|
500 | 0.313 | 0.367 | 0.882 | 0.974 | 0.887 | 0.928 |
1000 | 0.291 | 0.268 | 0.884 | 0.981 | 0.881 | 0.929 |
2000 | 0.244 | 0.245 | 0.915 | 0.981 | 0.919 | 0.949 |
3000 | 0.220 | 0.244 | 0.923 | 0.982 | 0.927 | 0.954 |
5500 | 0.190 | 0.191 | 0.926 | 0.987 | 0.926 | 0.955 |
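As a quick sanity check on the table above, F1 is the harmonic mean of precision and recall, so the reported F1 column can be recomputed from the other two. The sketch below does this for each row. Small differences in the third decimal are expected because the table values are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (step, precision, recall, reported F1) copied from the results table.
ROWS = [
    (500, 0.974, 0.887, 0.928),
    (1000, 0.981, 0.881, 0.929),
    (2000, 0.981, 0.919, 0.949),
    (3000, 0.982, 0.927, 0.954),
    (5500, 0.987, 0.926, 0.955),
]

for step, p, r, reported in ROWS:
    print(f"step {step}: computed F1 = {f1_score(p, r):.3f}, reported = {reported:.3f}")
```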
Key Observations
- The F1 score rose from 0.928 at step 500 to ~0.955 at step 5500, with precision (0.987) remaining higher than recall (0.926).
- Validation loss decreased to about 0.19 by step 5500, closely tracking the training loss (0.190), which suggests good generalization and minimal overfitting.
- The model showed robust performance on customer feedback across various categories in Turkish, making it well-suited for zero-shot classification tasks.
Evaluation Dataset
- Dataset Size: 951,093 samples
- Split: ~85% training, ~15% validation
- Label Distribution:
  - Entailment: 137,685 examples
  - Not Entailment: 813,408 examples
- Sequence Length: 95th percentile of token lengths used (max_length=116).
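Choosing the truncation length from a percentile of token lengths, as described above, can be sketched as follows. The nearest-rank percentile method and the helper name `percentile_max_length` are assumptions, since the exact procedure behind max_length=116 is not stated. The toy lengths stand in for real tokenizer output.

```python
def percentile_max_length(token_lengths, pct=95):
    """Nearest-rank percentile of token lengths, usable as a truncation limit."""
    if not token_lengths:
        raise ValueError("token_lengths must not be empty")
    ordered = sorted(token_lengths)
    # Integer ceiling of pct * n / 100 gives the 1-based nearest rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Toy lengths for illustration; real values would come from tokenizing the dataset.
toy_lengths = list(range(1, 101))
print(percentile_max_length(toy_lengths))
```

With a limit chosen this way, roughly 5% of inputs are truncated, in exchange for shorter padded batches.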