---
library_name: transformers
license: mit
datasets:
- coltekin/offenseval2020_tr
language:
- tr
pipeline_tag: text-classification
---

# atasoglu/turkish-base-bert-uncased-offenseval2020_tr

This is an offensive language detection model for Turkish, obtained by fine-tuning [ytu-ce-cosmos/turkish-base-bert-uncased](https://huggingface.co/ytu-ce-cosmos/turkish-base-bert-uncased) on the [coltekin/offenseval2020_tr](https://huggingface.co/datasets/coltekin/offenseval2020_tr) dataset. It assigns each input one of two labels: `NOT` (not offensive) or `OFF` (offensive).

## Usage

Quick usage with the `pipeline` API:

```py
from transformers import pipeline

pipe = pipeline("text-classification", "atasoglu/turkish-base-bert-uncased-offenseval2020_tr")
print(pipe("bu bir test metnidir.", top_k=None))
# [{'label': 'NOT', 'score': 0.9970345497131348}, {'label': 'OFF', 'score': 0.0029654440004378557}]
```
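
With `top_k=None` the pipeline returns one score per label, as in the output above. A minimal sketch of turning that list into a single decision is shown here. The helper name `is_offensive` and the 0.5 threshold are illustrative assumptions, not part of the model or of `transformers`.

```python
# Scores copied from the quick-usage output above.
scores = [
    {"label": "NOT", "score": 0.9970345497131348},
    {"label": "OFF", "score": 0.0029654440004378557},
]


def is_offensive(scores, threshold=0.5):
    """Return True when the 'OFF' score reaches the (assumed) threshold."""
    by_label = {item["label"]: item["score"] for item in scores}
    return by_label["OFF"] >= threshold


# Pick the label with the highest score.
top = max(scores, key=lambda item: item["score"])
print(top["label"], is_offensive(scores))
# NOT False
```

Lowering the threshold trades precision for recall on the `OFF` class, so any value other than 0.5 should be checked on held-out data first.
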

Alternatively, load the tokenizer and model directly and run inference yourself:

```py
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_id = "atasoglu/turkish-base-bert-uncased-offenseval2020_tr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).to(device)


@torch.no_grad()
def predict(X):
    # Tokenize a list of texts into fixed-length tensors.
    inputs = tokenizer(X, padding="max_length", truncation=True, max_length=256, return_tensors="pt")
    outputs = model(**inputs.to(device))
    # Return the index of the highest-scoring class for each text.
    return torch.argmax(outputs.logits, dim=-1).tolist()


print(predict(["bu bir test metnidir."]))
# [0]
```
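
`predict` returns class indices only. The sketch below shows one way to turn a row of logits into labelled probabilities, using plain Python so it runs without downloading the model. The index-to-label mapping is an assumption inferred from the two examples above (the same text yields `NOT` from the pipeline and `0` from `predict`), so check it against `model.config.id2label`. The logits are made-up values for illustration.

```python
import math

# Assumed mapping, inferred from the examples above (index 0 <-> 'NOT').
# Verify against model.config.id2label before relying on it.
ID2LABEL = {0: "NOT", 1: "OFF"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_probs(logits):
    """Map one row of logits to a {label: probability} dict."""
    return {ID2LABEL[i]: p for i, p in enumerate(softmax(logits))}


# Hypothetical logits for a single input, for illustration only.
probs = label_probs([2.9, -2.9])
print(max(probs, key=probs.get))
# NOT
```
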

## Test Results

The model was evaluated on the *test* split of the fine-tuning dataset (3528 examples).

| |precision|recall|f1-score|support|
|------------:|:--------|:-----|:-------|:------|
| NOT|0.9162 |0.9559|0.9356 |2812 |
| OFF|0.7912 |0.6564|0.7176 |716 |
| accuracy| | |0.8951 |3528 |
| macro avg|0.8537 |0.8062|0.8266 |3528 |
| weighted avg|0.8908 |0.8951|0.8914 |3528 |
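
The aggregate rows follow from the per-class rows: F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean over classes, and the weighted average uses each class's support as its weight. The short sketch below recomputes them from the per-class figures in the table. Because those figures are already rounded to four decimals, the results should match the table only to about three decimal places.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
rows = {
    "NOT": (0.9162, 0.9559, 0.9356, 2812),
    "OFF": (0.7912, 0.6564, 0.7176, 716),
}


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


total = sum(row[3] for row in rows.values())

# Unweighted mean over classes for precision, recall and F1.
macro = [sum(row[i] for row in rows.values()) / len(rows) for i in range(3)]

# Support-weighted mean over classes for precision, recall and F1.
weighted = [sum(row[i] * row[3] for row in rows.values()) / total for i in range(3)]

for name, (p, r, _, _) in rows.items():
    print(name, "f1", round(f1(p, r), 3))
print("macro", [round(x, 3) for x in macro])
print("weighted", [round(x, 3) for x in weighted])
```

The support-weighted recall equals the overall accuracy, which is why both appear as 0.8951 in the table.
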