---
library_name: transformers
license: mit
datasets:
- coltekin/offenseval2020_tr
language:
- tr
pipeline_tag: text-classification
---

# atasoglu/turkish-base-bert-uncased-offenseval2020_tr

This is an offensive language detection model for Turkish, obtained by fine-tuning [ytu-ce-cosmos/turkish-base-bert-uncased](https://huggingface.co/ytu-ce-cosmos/turkish-base-bert-uncased) on the [coltekin/offenseval2020_tr](https://huggingface.co/datasets/coltekin/offenseval2020_tr) dataset. It classifies a text as `NOT` (not offensive) or `OFF` (offensive).

## Usage

Quick usage:

```py
from transformers import pipeline
pipe = pipeline("text-classification", "atasoglu/turkish-base-bert-uncased-offenseval2020_tr")
print(pipe("bu bir test metnidir.", top_k=None))
# [{'label': 'NOT', 'score': 0.9970345497131348}, {'label': 'OFF', 'score': 0.0029654440004378557}]
```
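With `top_k=None`, the pipeline returns one `{'label', 'score'}` dictionary per class, as in the output above. If the default highest-score decision does not suit a use case, the `OFF` score can be compared against a custom cutoff instead. The following is a minimal sketch under that assumption; `is_offensive` and the threshold values are hypothetical and not part of the model or the `transformers` API.

```py
def is_offensive(scores, threshold=0.5):
    """Return True if the 'OFF' score in a pipeline result meets the threshold.

    `scores` is the list of {'label', 'score'} dicts returned by
    pipe(text, top_k=None) for a single input string.
    """
    off_score = next(s["score"] for s in scores if s["label"] == "OFF")
    return off_score >= threshold

# Result copied from the pipeline example above.
scores = [
    {"label": "NOT", "score": 0.9970345497131348},
    {"label": "OFF", "score": 0.0029654440004378557},
]
print(is_offensive(scores))                 # False
print(is_offensive(scores, threshold=0.3))  # False
```

A lower threshold flags more texts as offensive at the cost of more false positives, so any cutoff should be checked on held-out data.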

Or load the tokenizer and model directly to get class indices:

```py
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_id = "atasoglu/turkish-base-bert-uncased-offenseval2020_tr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id).to(device)

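@torch.no_grad()  # disable gradient tracking during inference
def predict(X):
    # Tokenize a batch of strings, padding and truncating each one to 256 tokens
    inputs = tokenizer(X, padding="max_length", truncation=True, max_length=256, return_tensors="pt")
    outputs = model(**inputs.to(device))
    # Return the index of the highest-scoring class for each input
    return torch.argmax(outputs.logits, dim=-1).tolist()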

print(predict(["bu bir test metnidir."]))
# [0]
```
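The `predict` function returns class indices rather than label names or probabilities. As a rough sketch of how one row of logits could be turned into labeled scores, the snippet below applies a softmax in plain Python. The `ID2LABEL` mapping is an assumption inferred from the example outputs (index `0` for `NOT`); in practice it should be read from `model.config.id2label`. The helper `logits_to_scores` and the sample logits are hypothetical.

```py
import math

# Assumed index-to-label mapping; in practice read it from model.config.id2label.
ID2LABEL = {0: "NOT", 1: "OFF"}

def logits_to_scores(logits):
    """Apply a numerically stable softmax to one row of logits and label the result."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {ID2LABEL[i]: e / total for i, e in enumerate(exps)}

# Hypothetical logits for a single input, not real model output.
scores = logits_to_scores([2.0, -1.0])
print(max(scores, key=scores.get))  # NOT
```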

## Test Results

Results were evaluated on the *test* split of the fine-tuning dataset.

|             |precision|recall|f1-score|support|
|------------:|:--------|:-----|:-------|:------|
|          NOT|0.9162   |0.9559|0.9356  |2812   |
|          OFF|0.7912   |0.6564|0.7176  |716    |
|     accuracy|         |      |0.8951  |3528   |
|    macro avg|0.8537   |0.8062|0.8266  |3528   |
| weighted avg|0.8908   |0.8951|0.8914  |3528   |
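For readers unfamiliar with the averaged rows, the sketch below shows how they relate to the per-class rows: the macro average is the unweighted mean across classes, and the weighted average uses each class's support as its weight. The inputs are the rounded values from the table, so the results should only be expected to agree to about four decimal places. The helper names `macro_avg` and `weighted_avg` are illustrative.

```py
# Per-class metrics copied from the table above: (precision, recall, f1, support).
per_class = {
    "NOT": (0.9162, 0.9559, 0.9356, 2812),
    "OFF": (0.7912, 0.6564, 0.7176, 716),
}

total = sum(row[3] for row in per_class.values())

def macro_avg(col):
    """Unweighted mean of one metric column across classes."""
    return sum(row[col] for row in per_class.values()) / len(per_class)

def weighted_avg(col):
    """Support-weighted mean of one metric column across classes."""
    return sum(row[col] * row[3] for row in per_class.values()) / total

for name, col in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(f"{name}: macro={macro_avg(col):.4f} weighted={weighted_avg(col):.4f}")
```

Note that the support-weighted recall equals the overall accuracy, which is why both appear as 0.8951 in the table.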